
Moderating content with logical operators | Intro to CS - Python | Khan Academy


Nov 10, 2024

Let's design a program with compound Boolean expressions.

We're working on an automated content moderation system for our site. We want our system to automatically flag posts that seem questionable so our team can investigate further and decide which ones to take down. We also want to automatically promote any posts that we think are particularly useful so that they appear at the top of the page.

Let's think about what our algorithm for flagging posts might look like. Our moderators have told us that they're especially wary of new accounts. It's not often that users who have been on our site for a while all of a sudden start posting a ton of spam. There are, of course, legitimate new users, and we don't want to discourage them from using our site by flagging their posts all the time. So we think the best thing to do might be to check for a combination of conditions.

We also have available the overall sentiment of the post: whether it's positive, neutral, or negative. Here, we're most worried about negative posts. We definitely don't want users bullying each other or promoting violence on our site.

All right, now what about an algorithm for promoting useful posts? We want to make sure not to promote posts that are negative, but we want to be fair and not just promote posts that talk about how great our site is. So we'll consider both positive and neutral posts.

Then we decided we also want to value consistent users: people who have been on our site for a while and represent trusted voices. We recognize this isn't a perfect algorithm, but we think it might be good enough for what we're trying to do here.

Next step: let's translate our content moderation algorithms into code. We start with our two pieces of data: the sentiment of the post and the user's account age in days. Our algorithm for flagging was: if the sentiment is negative and the account is new. Let's say new means less than a week old. That means our condition is: the sentiment == "negative" and the account age in days < 7. It's an AND because we only want to flag a post if both conditions are true.

We surround our compound condition with an if statement, and then inside the if statement, we want to print our content moderation decision. That lets our moderators know to take a closer look at this post.
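As a rough sketch, the flagging check described above might look like the following. The variable names (sentiment, account_age_days, decision), the sample values, and the printed message are assumptions for illustration, not taken from the video.

```python
# Hypothetical data for one post (names and values are assumptions).
sentiment = "negative"
account_age_days = 3

decision = "none"

# Flag only when BOTH conditions are true: negative post AND new account.
if sentiment == "negative" and account_age_days < 7:
    decision = "flag"
    print("Flagged: moderators should take a closer look at this post.")
```

With these sample values both halves of the AND are true, so the message prints.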

Okay, now let's test this on a couple of different posts. A negative post created by a new user should get this flag. A negative post created by a longtime user shouldn't, and a positive post created by a new user shouldn't either.
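One way to run those three cases in a single pass is to wrap the condition in a small helper. The function name should_flag and the specific account ages are hypothetical, chosen only to stand in for "new" and "longtime" users.

```python
def should_flag(sentiment, account_age_days):
    """Return True for a negative post from an account under a week old."""
    return sentiment == "negative" and account_age_days < 7


# (sentiment, account age in days): illustrative values only.
test_posts = [
    ("negative", 2),    # negative post, new user: should be flagged
    ("negative", 400),  # negative post, longtime user: should not be flagged
    ("positive", 2),    # positive post, new user: should not be flagged
]

for sentiment, account_age_days in test_posts:
    print(sentiment, account_age_days, should_flag(sentiment, account_age_days))
```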

Let's work on our post promotion algorithm next. This algorithm was for neutral or positive posts from trusted accounts. Because there are only three possible sentiments (positive, neutral, and negative), this is equivalent to saying the sentiment is not equal to negative. For trusted users, let's go with an account age that's greater than or equal to 30 days. That's not the perfect equivalent, but we think it'll provide a good approximation.
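The equivalence claimed here can be checked by trying every possible sentiment. This is a small sketch; it assumes the sentiment is always exactly one of the three strings shown, and the variable names are hypothetical.

```python
results = []

for sentiment in ["positive", "neutral", "negative"]:
    # Spell out the two sentiments we accept...
    by_listing = sentiment == "positive" or sentiment == "neutral"
    # ...or rule out the one sentiment we reject.
    by_exclusion = sentiment != "negative"
    results.append(by_listing == by_exclusion)
    print(sentiment, by_listing, by_exclusion)
```

The two expressions agree for all three values. If a fourth sentiment were ever introduced, they would no longer be interchangeable.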

So we add our if statement, and then inside the if statement, we just want to signal that the post has been featured. So we indent a print function call inside the statement. Then let's run through a few test cases to make sure this is working as intended: a negative post by a trusted user, a positive post by a new user, and a neutral post by a trusted user.
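A sketch of the promotion check and the three test cases might look like this. The helper name should_feature, the account ages, and the printed message are assumptions for illustration.

```python
def should_feature(sentiment, account_age_days):
    """Return True for a non-negative post from an account at least 30 days old."""
    return sentiment != "negative" and account_age_days >= 30


# (sentiment, account age in days): illustrative values only.
test_posts = [
    ("negative", 365),  # negative post, trusted user: not featured
    ("positive", 2),    # positive post, new user: not featured
    ("neutral", 365),   # neutral post, trusted user: featured
]

for sentiment, account_age_days in test_posts:
    if should_feature(sentiment, account_age_days):
        print("Featured:", sentiment, account_age_days)
```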

We have it working, but we want to refine our algorithm a bit. We are noticing that some not super useful posts are getting featured, like that post we had at the beginning that just said "hi." We think we can make a good generalization here that posts that are super short or super long are probably not the most useful. So let's add another condition to our feature case.

To do this, we need a new piece of information about each post: we need to know how many words it has. Our team says it should be easy to get this data, so we'll add a new variable: word count. This is about to make our conditions super long, so I'm going to break it up into multiple pieces to make it easier to read. Post lengths that we don't like are less than or equal to 3 words or greater than 200 words.

We use an OR here because it's suspicious if either condition is true. We store that intermediate result in a variable, and then we add it on to our feature condition. We use an AND here because we want all three of these conditions to be true in order to feature the post. However, we don't want to feature it if it's a suspicious length; we want to feature it if it's not a suspicious length. So we use the NOT operator here.
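Putting the OR, AND, and NOT together, the refined check could be sketched as below. The function and parameter names are hypothetical; the thresholds (3 words, 200 words, 30 days) come from the description above.

```python
def should_feature(sentiment, account_age_days, word_count):
    # Suspicious if EITHER bound is violated, so we use OR.
    is_suspicious_length = word_count <= 3 or word_count > 200
    # All three conditions must hold, so we use AND; NOT flips the length check.
    return sentiment != "negative" and account_age_days >= 30 and not is_suspicious_length


print(should_feature("neutral", 500, 1))   # a one-word post from an old account
print(should_feature("neutral", 500, 50))  # a mid-length post from the same account
```

Note that the return line is deliberately left as one long condition here, which is the kind of line a linter may complain about.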

Now let's check our condition with that post that said "hi." It had a word count of one, a neutral sentiment, and a pretty old account. Great! Now that post is no longer being featured. However, I see I'm getting a lint error now where my line is too long. To fix this, I'm going to break my condition up into multiple variables.

Let's say a useful post is one that is not negative and not a suspicious length, and let's store that result in a variable called is_useful_post. Then let's store the result of the account age check in a variable called is_trusted_user. Then our condition just becomes: if is_useful_post and is_trusted_user, which is a lot easier to understand at a glance. In fact, it's so readable that it's self-documenting: we don't really need this comment anymore, because it just says the same thing as the variable names.
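The refactoring into named intermediate results might look roughly like this. The sample values, the featured variable, and the printed message are assumptions; the boolean names follow the ones spoken in the walkthrough.

```python
# Hypothetical data for one post.
sentiment = "positive"
account_age_days = 120
word_count = 45

# Each intermediate result gets a descriptive name.
is_suspicious_length = word_count <= 3 or word_count > 200
is_useful_post = sentiment != "negative" and not is_suspicious_length
is_trusted_user = account_age_days >= 30

featured = False
if is_useful_post and is_trusted_user:
    featured = True
    print("Post featured.")
```

Each line now fits comfortably within a typical line-length limit, and the final condition reads almost like the plain-English rule.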

We'll go on and test with a few more cases to make sure everything works, and then we'll make sure to monitor how our algorithm performs on our site so we can keep making adjustments as needed.
