Guest Post: How to build a responsible code LLM with crowdsourcing*
TheSequence
MAY 29, 2023
If you use scraped data, the model might pick up private information and amplify existing biases, and so do more harm than good. Instead of asking the Toloka crowd (also known as Tolokers) to label every type of PII (personally identifiable information) in code within a single assignment, we grouped PII into 7 categories and set up a separate labeling project for each.
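To make the per-category setup concrete, here is a minimal Python sketch of fanning a set of code snippets out into one labeling project per PII category. It is an illustration only, not the authors' pipeline: the category names are hypothetical placeholders (the text above says there were 7 categories but does not name them), and `build_projects` and the task fields are assumptions that do not correspond to any Toloka API.

```python
from collections import defaultdict

# Hypothetical placeholder names: the source states there were 7 PII
# categories but does not list them, so none are invented here.
PII_CATEGORIES = [f"pii_category_{i}" for i in range(1, 8)]


def build_projects(snippets, categories=PII_CATEGORIES):
    """Return {category: [task, ...]}: one labeling project per category.

    Each task asks about a single category in a single snippet, so an
    annotator only has to keep one definition in mind at a time.
    """
    projects = defaultdict(list)
    for snippet_id, code in enumerate(snippets):
        for category in categories:
            projects[category].append(
                {
                    "snippet_id": snippet_id,
                    "code": code,
                    "question": f"Mark every span of {category} in this snippet.",
                }
            )
    return dict(projects)


# Two toy snippets produce 7 projects with 2 tasks each.
projects = build_projects(["email = 'a@example.com'", "print('hi')"])
```

The trade-off in this sketch is that every snippet is shown once per category, so the total task count grows with the number of categories, in exchange for simpler instructions in each project.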