Blacklists are broken again.
Just today (24th of May 2016) I wasn't able to have my blacklisted images blocked. And I'm still not able to. Here's a list of all my blacklists.

spoiler

And now the hidden posts only block default blacklists like shota, death, seizure warning, and loli. I don't know why this is happening. If someone could help me out on this, I would very much appreciate it. Thank you.
-waverun
waverun said:
Just today (24th of May 2016) I wasn't able to have my blacklisted images blocked. And I'm still not able to. Here's a list of all my blacklists.

spoiler

And now the hidden posts only block default blacklists like shota, death, seizure warning, and loli. I don't know why this is happening. If someone could help me out on this, I would very much appreciate it. Thank you.
-waverun
queen_chrysalis

That'd explain post #35761. I kept waiting for you to go on a rant, but you never came. Hint: it involves Bug Butt, and implied Tia.

I don't know what's happening, but I gotta say, that's a loooooooooooong list. I don't really care for blacklisting, but I can definitely see why it's a problem if it's down.

sunset_shimmer
sonata_dusk
Really? Any reason why?
Well, there's your problem. You've blacklisted everything.
Changer said:
Well, there's your problem. You've blacklisted everything.
That a fact, or a bash? Does having a huge blacklist break it or something? First time I've heard of it.
Given the size of your blacklist, I'd guess it's the same issue that was cropping up with large blacklists before. It doesn't seem like Vanndril has found a fix for it yet, either.
how the fuck did it get so big i mean holy shit
Pinkanator said:
sunset_shimmer
sonata_dusk
Really? Any reason why?
I almost couldn't stand Equestria Girls, especially most of the villains in it, but now, just for the sake of hypno-porn, I might try to remove those tags.
Okay, I've got it fixed.

Here's the updated version of my blacklists.

spoiler
Trust me, I have A LOT of standards.
waverun said:
Trust me, I have A LOT of standards.
Do they involve spaghetti?

No, to be fair, most of that blacklist is reasonable. The problem comes when those things are rather rare, meaning the blacklist ends up being redundant most of the time.
Pinkanator said:
queen_chrysalis

That'd explain post #35761. I kept waiting for you to go on a rant, but you never came. Hint: it involves Bug Butt, and implied Tia.
Yeah, there's just something about her that completely destroys the purity I have, little or big.

At least when I deceive people (which in all honesty is not too often), I do it as MYSELF!
waverun said:
Yeah, there's just something about her that completely destroys the purity I have, little or big.
Was that supposed to sound fetishy, or am I just more of a pervert than I realized?

waverun said:
At least when I deceive people (which in all honesty is not too often), I do it as MYSELF!
Also, yeah I get why you don't like that. And the whole thing with your mistress and whatnot. She did fuck with Lyra, so y'know. Not in my best books. She's just... y'know.
Pinkanator said:
Was that supposed to sound fetishy, or am I just more of a pervert than I realized?
No, it was supposed to mean that it destroys all of the purity and goodness of heart I have (whether big, or little) in terms of kindness to others, specifically her, and toleration of others, specifically her.
waverun said:
No, it was supposed to mean that it destroys all of the purity and goodness of heart I have (whether big, or little) in terms of kindness to others, specifically her, and toleration of others, specifically her.
Ah. Sounds cool. Sounds edgy. I like it.

spoiler
Pinkanator said:
That a fact, or a bash? Does having a huge blacklist break it or something? First time I've heard of it.
Sort of in the middle. An extremely long list might take so long to process that the processing just fails. I was intending to have a sort of joking tone though; not a bashing one.
Dreamshade said:
Given the size of your blacklist, I'd guess it's the same issue that was cropping up with large blacklists before. It doesn't seem like Vanndril has found a fix for it yet, either.
This. There is definitely something wrong with the blacklist system, and it's most likely to have something to do with size (though I can't figure out what KIND of size - I've ruled out tag count, bytes/words stored in the blacklist field of the database, and a few others).

Days of effort (as in, something like 50 actual hours) on my part have gone into actively looking into this, albeit not recently. And I still have no idea what's causing the problem. To be honest, I'm not really hopeful that there will ever be a real solution, either...

waverun said:
Okay, I've got it fixed.

Here's the updated version of my blacklists.

spoiler
Oh! Thanks for sharing both blacklists. I'm going to compare the two and see if anything you changed stands out as to what might have caused the problem. Maybe I'll get some sort of lead.

Changer said:
An extremely long list might take so long to process that the processing just fails.
I hadn't considered that. I wonder if the process is timing out due to too many populated tags being blacklisted. Interesting thought, and something I'll have to look into.

Random Related Thoughts
Vanndril said:
I hadn't considered that. I wonder if the process is timing out due to too many populated tags being blacklisted. Interesting thought, and something I'll have to look into.

Random Related Thoughts
I don't know how it's coded, but I would imagine it would have to, in some way, search each image for every tag on it, then compare those tags to the blacklist for matches. If we assume an average of 20 tags per image and, as I believe, 30 images per page, then a blacklist as long as Waverun's original list, which is apparently 189 tags long, ends up with 113,400 comparisons being made (20 × 30 × 189). That's a lot to process. (Incidentally, the new blacklist is 184 long, which ends up with about 3,000 fewer comparisons.)
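For a rough sense of scale, that estimate can be checked with a few lines of Python. This is only a sketch of the naive worst case, where every tag on every image is compared against every blacklist entry. The function name is made up for illustration, and the 20 tags per image and 30 images per page are the guesses from this post, not values taken from the site's code.

```python
def estimated_comparisons(tags_per_image: int, images_per_page: int, blacklist_size: int) -> int:
    """Naive worst case: each tag on each image is checked against each blacklist entry."""
    return tags_per_image * images_per_page * blacklist_size


original = estimated_comparisons(20, 30, 189)  # Waverun's original list
updated = estimated_comparisons(20, 30, 184)   # the shortened list

print(original)            # 113400
print(updated)             # 110400
print(original - updated)  # 3000
```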
ZeldaIsHot said:
how the fuck did it get so big i mean holy shit
I realize this is a very large issue we're discussing here but er...

"That's what she said"

I'll go back to lurking now
Changer said:
I don't know how it's coded, but I would imagine it would have to, in some way, search each image for every tag on it, then compare those tags to the blacklist for matches. If we assume an average of 20 tags per image and, as I believe, 30 images per page, then a blacklist as long as Waverun's original list, which is apparently 189 tags long, ends up with 113,400 comparisons being made (20 × 30 × 189). That's a lot to process. (Incidentally, the new blacklist is 184 long, which ends up with about 3,000 fewer comparisons.)
Possibly. But wouldn't that logically imply a tag count cap for blacklists before running into this issue? The number of tags doesn't matter. I can find a blacklist that doesn't work, then make a new blacklist with the same number of different tags and have the new one work fine.
Vanndril said:
Possibly. But wouldn't that logically imply a tag count cap for blacklists before running into this issue? The number of tags doesn't matter. I can find a blacklist that doesn't work, then make a new blacklist with the same number of different tags and have the new one work fine.
I think it would be sort of a soft cap, as at any given time there could be more or fewer tags being searched on a single page. Just to test, I singled out the tags that were different and added them to my blacklist, and it continued to work. Then I tried adding repeat tags, since I noticed several of those, and it still worked. Then I tried making up non-tags, since there are a lot of tags in the list that don't exist. It still worked after that too.

So, I just pasted the entire original blacklist into my blacklist, and it continued to work; then I removed my blacklist and just had the original blacklist, and it still worked after that too. Then I decided to paste the entire list twice, and it still worked.

Based on that, I'm thinking that, in addition to seeming to affect longer lists, the bug must depend on some factors outside the blacklist as well: the level of traffic the site is having at the time, the number of total tags on the page being viewed, or browser settings, if any calculations are being done browser-side at all.
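For anyone who wants to repeat the first two checks (which entries differ between two lists, and which entries are repeated), here is a minimal Python sketch. It assumes a blacklist is plain text with one entry per line and that matching is case-insensitive; both are assumptions, not confirmed site behaviour. All function names are made up for illustration, and the sample entries are placeholders, not the actual lists from this thread.

```python
from collections import Counter


def parse_blacklist(text: str) -> list[str]:
    """Split a blacklist into lowercase entries, one per non-empty line."""
    return [line.strip().lower() for line in text.splitlines() if line.strip()]


def repeated_entries(entries: list[str]) -> list[str]:
    """Return the entries that appear more than once, sorted."""
    counts = Counter(entries)
    return sorted(entry for entry, n in counts.items() if n > 1)


def changed_entries(old: list[str], new: list[str]) -> tuple[list[str], list[str]]:
    """Return (removed, added) entries when going from the old list to the new one."""
    old_set, new_set = set(old), set(new)
    return sorted(old_set - new_set), sorted(new_set - old_set)


# Placeholder data for illustration only.
old = parse_blacklist("Disney\nkaa\n\nkaa\nthe_lion_king\n")
new = parse_blacklist("disney\nkaa\n")

print(repeated_entries(old))      # ['kaa']
print(changed_entries(old, new))  # (['the_lion_king'], [])
```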
You can probably get rid of all the mustache tags, since the only ones left are the sonic boom one which you have blocked using other tags.
Have you considered... condensing? I mean, "Entrancement_UK" should be covered by "Real", and "Beauty_and_the_beast", "Kaa", "the_lion_king" and a few others are all "Disney". Not to mention 3 different Ed, Edd and Eddy tags...
myrmidon said:
Have you considered... condensing? I mean, "Entrancement_UK" should be covered by "Real", and "Beauty_and_the_beast", "Kaa", "the_lion_king" and a few others are all "Disney". Not to mention 3 different Ed, Edd and Eddy tags...
The problem with condensing tags is a lack of specificity. For example, The Lion King is usually considered a Disney work, but there are also a lot of other Disney works. If we condensed all Disney works into the Disney tag, and a user very specifically wanted fanart of The Lion King, they'd be unable to search for that.

There may be some minor condensing that can be done, but nothing that I can think of would be drastic enough to fix this sort of problem, assuming that the blacklist error is caused by the density of our tag list.

Changer said:
I think it would be sort of a soft cap, as at any given time there could be more or fewer tags being searched on a single page. Just to test, I singled out the tags that were different and added them to my blacklist, and it continued to work. Then I tried adding repeat tags, since I noticed several of those, and it still worked. Then I tried making up non-tags, since there are a lot of tags in the list that don't exist. It still worked after that too.

So, I just pasted the entire original blacklist into my blacklist, and it continued to work; then I removed my blacklist and just had the original blacklist, and it still worked after that too. Then I decided to paste the entire list twice, and it still worked.

Based on that, I'm thinking that, in addition to seeming to affect longer lists, the bug must depend on some factors outside the blacklist as well: the level of traffic the site is having at the time, the number of total tags on the page being viewed, or browser settings, if any calculations are being done browser-side at all.
This is...interesting. You used the very same blacklist as waverun posted in the opening post of this thread? And it worked for you?

That would certainly suggest that something outside of the blacklists themselves is causing the problem.

You may be onto something.
This thread is making me want a cyber-detective show with Vanndril and Changer.
Vanndril said:
For example, The Lion King is usually considered a Disney work, but there are also a lot of other Disney works. If we condensed all Disney works into the Disney tag, and a user very specifically wanted fanart of The Lion King, they'd be unable to search for that.
I'm not talking about you condensing, I'm talking about OP condensing.

If you have all the Disney stuff blacklisted individually, why not save some effort and just blacklist "Disney"?
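As a rough illustration of that kind of condensing, here is a small Python sketch. The `COVERED_BY` mapping is hypothetical and hand-written from the examples in this thread; it is not something the site provides. It also assumes every post carrying a specific tag is tagged with the broader one too, which mistagged posts would break.

```python
# Hypothetical mapping from a specific tag to the broader tag assumed to cover it.
COVERED_BY = {
    "the_lion_king": "disney",
    "beauty_and_the_beast": "disney",
    "kaa": "disney",
    "entrancement_uk": "real",
}


def condense(entries: list[str], covered_by: dict[str, str]) -> list[str]:
    """Drop entries whose broader tag is already on the blacklist."""
    present = set(entries)
    return [entry for entry in entries if covered_by.get(entry) not in present]


blacklist = ["disney", "kaa", "the_lion_king", "entrancement_uk", "uncle_sam"]
print(condense(blacklist, COVERED_BY))
# ['disney', 'entrancement_uk', 'uncle_sam']
```

Here "entrancement_uk" stays because "real" is not on the sample list; it would only be dropped once the broader tag is blacklisted as well.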
Vanndril said:
The problem with condensing tags is a lack of specificity. For example, The Lion King is usually considered a Disney work, but there are also a lot of other Disney works. If we condensed all Disney works into the Disney tag, and a user very specifically wanted fanart of The Lion King, they'd be unable to search for that.

There may be some minor condensing that can be done, but nothing that I can think of would be drastic enough to fix this sort of problem, assuming that the blacklist error is caused by the density of our tag list.
I believe they meant to address Waverun: since they already had Disney blacklisted, anything that is a Disney work would automatically be blacklisted, so they would not need to blacklist individual Disney works.
Vanndril said:
This is...interesting. You used the very same blacklist as waverun posted in the opening post of this thread? And it worked for you?

That would certainly suggest that something outside of the blacklists themselves is causing the problem.

You may be onto something.
Yeah, I used the list in the original post. It was a different day, though, so the posts on the front page had changed by then, and the amount of traffic the site had at the time was probably different as well.

Pinkanator said:
This thread is making me want a cyber-detective show with Vanndril and Changer.
"They think they can hide by using a few proxies? They underestimate me. I'll have their IP nailed down within the hour then I'll be able to determine exactly where they are coming from then I'll take a potato chip AND EAT IT!"
Anyone else think that big blacklists would be less of a problem with more attentive tagging? I see a lot of stuff fly by untagged, or good stuff mistagged. We should probably find some guidelines to go by for frequently-blacklisted tags so there's no question of "Yes or no to this tag."
There's definitely some non-deterministic stuff going on. I can randomly fix and re-break the above blacklist by deleting a dozen lines, saving, re-adding, and saving again. Size may be a factor but (190 lines of mostly one tag) * (30 posts) is like 6k string comparisons, and those are only of length (tags on an image); string comparison is expensive but it's not that expensive. In other words if size is a factor, it's not because of the run-time characteristics; it's because somewhere along the way the data goes bad because it was never designed for blacklists that long. It seems to be stored in the database fine because when you refresh your user options the whole list is still there, but whatever pulls it out and parses it for actual use might not be able to handle it.

EDIT: By the way, I could never find a case where the first line fails to work. I tested all kinds of different orders including alphabetizing the whole list by line. It seems that there are two states for a blacklist, "broken" and "not broken", and in the "broken" state only the first line of the blacklist will apply.
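To make the cost argument and the "broken" state concrete, here is a minimal Python sketch. It is not the site's actual code. It assumes one blacklist entry per line, that a line with several tags only matches a post carrying all of them, and it simply imitates the observed symptom by keeping only the first line. All names and sample tags are made up for illustration.

```python
def parse_blacklist(text: str) -> list[frozenset[str]]:
    """One blacklist entry per non-empty line; each entry is a set of tags."""
    entries = []
    for raw_line in text.splitlines():
        tags = frozenset(raw_line.lower().split())
        if tags:
            entries.append(tags)
    return entries


def is_hidden(post_tags: set[str], blacklist: list[frozenset[str]]) -> bool:
    """A post is hidden if it carries every tag of at least one blacklist line."""
    return any(entry <= post_tags for entry in blacklist)


blacklist = parse_blacklist("uncle_sam\nqueen_chrysalis\nsunset_shimmer sonata_dusk\n")
posts = [
    {"queen_chrysalis", "hypnosis"},
    {"sunset_shimmer"},
    {"sunset_shimmer", "sonata_dusk", "spiral_eyes"},
    {"uncle_sam"},
]

# "Not broken": every line is applied.
print([is_hidden(p, blacklist) for p in posts])      # [True, False, True, True]

# Imitating the "broken" state: only the first line is applied.
print([is_hidden(p, blacklist[:1]) for p in posts])  # [False, False, False, True]

# Work is at most (blacklist lines) * (posts) subset checks, e.g. 190 * 30.
print(190 * 30)                                      # 5700
```

Comparing the two printed lists against what a page actually hides is one way to tell which state a given blacklist is in.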
Why would you blacklist Uncle Sam? What kind of American are you ;P?
Dantus said:
Why would you blacklist Uncle Sam? What kind of American are you ;P?
Do we know if Waverun is American? You don't know! He could be British! You don't know! You don't know! Feel the unrelenting pain of not knowing! Mwahahahaha!!!
myrmidon said:
I'm not talking about you condensing, I'm talking about OP condensing.

If you have all the Disney stuff blacklisted individually, why not save some effort and just blacklist "Disney"?
Oh. Derp.

My bad.

greasyi said:
There's definitely some non-deterministic stuff going on. [...] In other words if size is a factor, it's not because of the run-time characteristics; it's because somewhere along the way the data goes bad because it was never designed for blacklists that long. It seems to be stored in the database fine because when you refresh your user options the whole list is still there, but whatever pulls it out and parses it for actual use might not be able to handle it.
So, the problem likely lies between the database and the function(s) that apply the blacklist, but neither the storage of the blacklist in the database itself nor the application of the blacklist is the cause? Hm... Yeah, if that's the case, this sort of thing is way over my head to fix.

I can confirm beyond the shadow of a doubt that the blacklists are stored in the database fine, btw. That's the first thing I checked, and I looked in the database itself.

greasyi said:
By the way, I could never find a case where the first line fails to work.
Another interesting fact, here. I have no idea what to make of it, though. @_@

Yeah. The scope of this problem is certainly beyond me.

Pinkanator said:
This thread is making me want a cyber-detective show with Vanndril and Changer.
Pft. That's kind of a fun idea to toy with. While Changer and I bumble about using pseudo-science terms that a lot of detective/cop shows use, Greasyi could pop in and be the only guy who ACTUALLY knows what he's talking about and uses real terminology that makes sense in context.

Looking at you, every CSI under the Sun.