
Moderating content with logical operators | Intro to CS - Python | Khan Academy


4m read
·Nov 10, 2024

Let's design a program with compound Boolean expressions.

We're working on an automated content moderation system for our site. We want our system to automatically flag posts that seem questionable so our team can investigate further and decide which ones to take down. We also want to automatically promote any posts that we think are particularly useful so that they appear at the top of the page.

Let's think about what our algorithm for flagging posts might look like. Our moderators have told us that they're especially wary of new accounts. It's not often that users who have been on our site for a while all of a sudden start posting a ton of spam. There are, of course, legitimate new users, and we don't want to discourage them from using our site by flagging their posts all the time. So we think the best thing to do might be to check for a combination of conditions we have available.

One of those conditions is the overall sentiment of the post: whether it's positive, neutral, or negative. Here, we're most worried about negative posts. We definitely don't want users bullying each other or promoting violence on our site.

All right, now what about an algorithm for promoting useful posts? We want to make sure not to promote posts that are negative, but we want to be fair and not just promote posts that talk about how great our site is. So we'll consider both positive and neutral posts.

Then we decided we also want to value consistent users: people who have been on our site for a while and represent trusted voices. We recognize this isn't a perfect algorithm, but we think it might be good enough for what we're trying to do here.

Next step: let's translate our content moderation algorithms into code. We start with our two pieces of data: the sentiment of the post and the user's account age in days. Our algorithm for flagging was: if the sentiment is negative and the account is new. Let's say new is less than a week old. That means our condition is: the sentiment is equal to "negative" (using the == comparison operator), and the account age in days is less than seven. It's an AND because we only want to flag if both conditions are true.

We surround our compound condition with an if statement, and then inside the if statement, we want to print our content moderation decision. That lets our moderators know to take a closer look at this post.
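As a rough sketch of what that flagging check might look like in Python: the variable names `sentiment` and `account_age_days`, the sample values, and the printed message are all assumptions for illustration, since the transcript doesn't show the exact code.

```python
# Hypothetical sample values; a real system would read these from the post.
sentiment = "negative"
account_age_days = 3

# "and" means the post is flagged only when BOTH conditions are true.
if sentiment == "negative" and account_age_days < 7:
    print("FLAGGED: send to moderators for review")
```

With these sample values both conditions hold, so the message is printed.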

Okay, now let's test this on a couple of different posts. A negative post created by a new user should get this flag. A negative post created by a longtime user shouldn't, and a positive post created by a new user shouldn't either.
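One way to run those three test cases is to wrap the condition in a small function. This is a sketch, not the code from the video: `should_flag` is a hypothetical helper name, and the specific account ages are made-up examples.

```python
def should_flag(sentiment, account_age_days):
    """Return True when the post is negative AND the account is under a week old."""
    return sentiment == "negative" and account_age_days < 7


print(should_flag("negative", 2))    # negative post, new user      -> True
print(should_flag("negative", 400))  # negative post, longtime user -> False
print(should_flag("positive", 2))    # positive post, new user      -> False
```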

Let's work on our post promotion algorithm next. This algorithm was for neutral or positive posts from trusted accounts. Because there are only three possible sentiments (positive, neutral, and negative), this is equivalent to saying the sentiment is not equal to negative. For trusted users, let's go with an account age that's greater than or equal to 30 days. That's not the perfect equivalent, but we think it'll provide a good approximation.
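The equivalence between "positive or neutral" and "not negative" can be checked by trying every sentiment. This sketch assumes the sentiment is always one of exactly these three strings. If other values were possible, the two forms would no longer agree.

```python
# Assumption: these are the only three values the sentiment can take.
SENTIMENTS = ["positive", "neutral", "negative"]

for sentiment in SENTIMENTS:
    long_form = sentiment == "positive" or sentiment == "neutral"
    short_form = sentiment != "negative"
    # Both forms give the same answer for every possible sentiment.
    print(sentiment, long_form, short_form)
```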

So we add our if statement, and then inside the if statement, we just want to signal that the post has been featured. So we indent a print function call inside the statement. Then let's run through a few test cases to make sure this is working as intended: a negative post by a trusted user, a positive post by a new user, and a neutral post by a trusted user.
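A minimal sketch of the promotion check and its three test cases follows. The helper name `should_feature` and the sample account ages are assumptions made for illustration.

```python
def should_feature(sentiment, account_age_days):
    """Return True for a non-negative post from an account at least 30 days old."""
    return sentiment != "negative" and account_age_days >= 30


print(should_feature("negative", 365))  # negative post, trusted user -> False
print(should_feature("positive", 2))    # positive post, new user     -> False
print(should_feature("neutral", 365))   # neutral post, trusted user  -> True
```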

We have it working, but we want to refine our algorithm a bit. We are noticing that some not super useful posts are getting featured, like that post we had at the beginning that just said "hi." We think we can make a good generalization here that posts that are super short or super long are probably not the most useful. So let's add another condition to our feature case.

To do this, we need a new piece of information about each post: we need to know how many words it has. Our team says it should be easy to get this data, so we'll add a new variable: word count. This is about to make our conditions super long, so I'm going to break it up into multiple pieces to make it easier to read. Post lengths that we don't like are less than or equal to 3 words or greater than 200 words.

We use an OR here because it's suspicious if either condition is true. We store that intermediate result in a variable, and then we add it on to our feature condition. We use an AND here because we want all three of these conditions to be true in order to feature the post. However, we don't want to feature it if it's a suspicious length; we want to feature it if it's not a suspicious length. So we use the NOT operator here.
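Here is a sketch of how the three-part condition could be written, with the length check stored in an intermediate variable first. The names `word_count` and `is_suspicious_length`, the sample values, and the printed message are assumptions.

```python
# Hypothetical sample post: neutral, 50 words, from a year-old account.
sentiment = "neutral"
account_age_days = 365
word_count = 50

# "or": the length is suspicious if EITHER bound is violated.
is_suspicious_length = word_count <= 3 or word_count > 200

# "and" requires all three parts; "not" flips the suspicious-length result.
if sentiment != "negative" and account_age_days >= 30 and not is_suspicious_length:
    print("FEATURED")
```

A 50-word post falls inside the acceptable range, so `is_suspicious_length` is `False` and the post is featured.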

Now let's check our condition with that post that said "hi." It had a word count of one, a neutral sentiment, and a pretty old account. Great! Now that post is no longer being featured. However, I see I'm getting a lint error now where my line is too long. To fix this, I'm going to break my condition up into multiple variables.

Let's say a useful post is one that is not negative and not a suspicious length, and let's store the result of the account age check in a variable called is_trusted_user. Then our condition just becomes: if is_useful_post and is_trusted_user, which is a lot easier to understand at a glance. In fact, it's so readable that it's self-documenting: we don't really need this comment anymore, because it just says the same thing as the variable names.
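The refactored version might look something like the sketch below. Wrapping it in a function named `should_feature` is an assumption made here so a few cases can be tried; the intermediate variable names follow the ones described above.

```python
def should_feature(sentiment, account_age_days, word_count):
    """Decide whether to feature a post, using named intermediate results."""
    is_suspicious_length = word_count <= 3 or word_count > 200
    is_useful_post = sentiment != "negative" and not is_suspicious_length
    is_trusted_user = account_age_days >= 30
    return is_useful_post and is_trusted_user


print(should_feature("neutral", 365, 1))    # the one-word "hi" post    -> False
print(should_feature("positive", 365, 50))  # useful post, trusted user -> True
print(should_feature("positive", 5, 50))    # useful post, new user     -> False
```

Each short line now fits comfortably within a typical line-length limit, and the final condition reads almost like the plain-English rule.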

We'll go on and test with a few more cases to make sure everything works, and then we'll make sure to monitor how our algorithm performs on our site so we can keep making adjustments as needed.
