Annotating data with the DefinedCrowd platform

You can use the DefinedCrowd platform to create an annotation job within Watson Studio. See Annotate data for the plan and role requirements for creating text annotation or image labeling jobs. You must have a DefinedCrowd enterprise account to integrate with Watson Studio.

In DefinedCrowd, an annotation job is called a project. A project receives input data units and delivers output data units. You choose the type of data you want to obtain by selecting one of the project templates and configuring the template according to your needs.

For additional quality control, you can enable tests in the project to evaluate the quality of each contributor's work. For some languages, you can also require contributors to pass a proficiency test in that language.

See the DefinedCrowd documentation for more information about quality control and projects. You need a DefinedCrowd enterprise account to view the documentation.

Requirements for DefinedCrowd annotation jobs within Watson Studio:

  • You must have an enterprise account with DefinedCrowd.
  • You can configure annotation jobs for sentiment analysis or image tagging.

Set up integration between Watson Studio and DefinedCrowd

You must have the Admin role in the project to set up integration.

To set up integration with DefinedCrowd:

  1. From your project’s Settings page, in the Integrations section, create an account with DefinedCrowd.
  2. In your DefinedCrowd account, create an access key:
    1. Go to Developers -> API Access Keys.
    2. Click Create access key.
    3. Select Full Access for Access key permissions.
    4. Click Create. Your Access Key ID and Access Key Secret are displayed. This is the only time you’ll see your Access Key Secret. If you lose it, create a new access key.
  3. Back in Watson Studio, add your Access Key ID and Access Key Secret in the Integrations section.

Create an annotation job

You must have the Admin or Editor role in the project to create an annotation job.

Requirements for DefinedCrowd text annotation jobs within Watson Studio:

  • For text annotation, you can annotate only data assets from CSV files.
  • The CSV file for a text annotation job cannot exceed 10,000 rows. If you have a larger data set, split it across multiple jobs.
  • The text to annotate must be in a single column.
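
If your data set is larger than the row limit, you can split it before you upload it. The following Python sketch shows one way to do that. It is not part of Watson Studio or DefinedCrowd. The function names are hypothetical, and it assumes that the limit counts data rows and that each part should repeat the header row.

```python
import csv
import io
from typing import Iterator, List


def split_rows(rows: List[List[str]], max_rows: int = 10_000) -> Iterator[List[List[str]]]:
    """Yield consecutive chunks of at most max_rows data rows."""
    for start in range(0, len(rows), max_rows):
        yield rows[start:start + max_rows]


def split_csv_text(csv_text: str, max_rows: int = 10_000) -> List[str]:
    """Split CSV text into several CSV texts, each repeating the header row.

    Assumption: the first row is a header and does not count toward the limit.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    data_rows = list(reader)

    parts = []
    for chunk in split_rows(data_rows, max_rows):
        out = io.StringIO()
        writer = csv.writer(out, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(chunk)
        parts.append(out.getvalue())
    return parts
```

Each returned string can be saved as its own CSV file and added to the project as a separate data asset, one per annotation job.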

To create a text annotation job:

  1. On the project’s Assets page, preview the data asset to annotate.
  2. Click Annotate, select DefinedCrowd as your provider, then click Next to configure your annotation job.
  3. Enter an annotation project name for your job, then select the Text Sentiment Annotation project template.
  4. Select your language and country preferences. If the option is available, you can require contributors to demonstrate language proficiency.
  5. Specify one column from your data set as the content to annotate. Only the data from this column is sent to DefinedCrowd.
  6. Select how many contributor responses to receive per row.
  7. Check that the data on the Summary pane matches what you configured for your job, then click Submit to send the job for annotation.

Requirements for DefinedCrowd image annotation jobs within Watson Studio:

  • Supported image formats are JPEG, PNG, WebP, TIFF, GIF, and SVG.
  • Image files must be packaged in a ZIP file for upload.
  • You must define at least one but not more than 10 tagging categories.
  • A category can have no more than 5 subcategories.
  • Each category or subcategory must have a description and at least 1 but no more than 5 examples.
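
To prepare images for upload, you can package them with a short script. The following Python sketch is one possible approach and is not an official tool. The function name and the extension list are assumptions: in particular, it treats .jpg and .tif as aliases for JPEG and TIFF, and it places all images at the top level of the ZIP file.

```python
import zipfile
from pathlib import Path
from typing import List

# Assumption: file extensions that correspond to the supported formats.
SUPPORTED_EXTENSIONS = {".jpeg", ".jpg", ".png", ".webp", ".tiff", ".tif", ".gif", ".svg"}


def package_images(image_dir: str, zip_path: str) -> List[str]:
    """Write every supported image in image_dir to zip_path and return their names."""
    added = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(Path(image_dir).iterdir()):
            # Skip subdirectories and files with unsupported extensions.
            if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS:
                archive.write(path, arcname=path.name)
                added.append(path.name)
    return added
```

The returned list lets you confirm which files were included before you upload the ZIP file.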

To create an image annotation job:

  1. On the project’s Assets page, preview the data asset to annotate.
  2. Click Annotate, select DefinedCrowd as your provider, then click Next to configure your annotation job.
  3. Enter an annotation project name for your job, then select the Image Tagging project template.
  4. Select how many contributor responses to receive per row.
  5. Define the categories for your tagging project. For each category, supply a description and some tagging examples, separated by commas. For example, if you are tagging animals, the examples might be dog, cat, pig, and so on.
  6. Optionally, add subcategories to further refine the tagging. For example, a subcategory of animals could be types of dogs, such as beagle, corgi, and dachshund.
  7. Check that the data on the Summary pane matches what you configured for your job, then click Submit to send the job for annotation.
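
Before you enter categories in the wizard, it can help to check a planned category list against the image tagging requirements listed earlier. The following Python sketch encodes those limits. The Category class and function names are hypothetical and exist only for illustration. The wizard itself is the authority on what is accepted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Category:
    """Hypothetical local model of a tagging category or subcategory."""
    name: str
    description: str
    examples: List[str]
    subcategories: List["Category"] = field(default_factory=list)


def check_entry(entry: Category, label: str) -> List[str]:
    """Check the description and example count of one category or subcategory."""
    problems = []
    if not entry.description.strip():
        problems.append(f"{label} '{entry.name}' needs a description")
    if not 1 <= len(entry.examples) <= 5:
        problems.append(f"{label} '{entry.name}' needs 1 to 5 examples")
    return problems


def validate_categories(categories: List[Category]) -> List[str]:
    """Return a list of problems; an empty list means all limits are met."""
    problems = []
    if not 1 <= len(categories) <= 10:
        problems.append("define between 1 and 10 categories")
    for category in categories:
        problems.extend(check_entry(category, "category"))
        if len(category.subcategories) > 5:
            problems.append(f"category '{category.name}' has more than 5 subcategories")
        for sub in category.subcategories:
            problems.extend(check_entry(sub, "subcategory"))
    return problems
```

For example, a single "animals" category with the examples dog, cat, and pig, and one "dogs" subcategory with the examples beagle, corgi, and dachshund, passes these checks as long as both have descriptions.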

In the information pane on the asset’s Preview page in Watson Studio, you can see the annotation job and its status. You can also click the link to open the DefinedCrowd monitoring dashboard. The dashboard displays information such as the job progress, annotation statistics, sentiment distribution, and demographic information about contributors.

When the annotation job is complete, you receive a notification that contains a link to the new data asset with the results. You can also view the result files for completed jobs in the information pane on the asset’s Preview page. The annotated CSV file for a text annotation job contains the original columns plus an ID column and a sentiments column that holds the annotations. The file returned for an image annotation job contains a URL for each image, information about the bounding coordinates for the image, and the annotation information for each image.
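
After you download an annotated CSV file, you might want a quick tally of the results. The following Python sketch counts the values in the sentiments column. It assumes that each cell in that column holds a single label. The actual cell format is not described here, so inspect your result file first and adjust the parsing if needed. The function name is hypothetical.

```python
import csv
import io
from collections import Counter


def sentiment_counts(csv_text: str, column: str = "sentiments") -> Counter:
    """Count how often each value appears in the given column.

    Assumption: each cell contains exactly one sentiment label.
    Rows where the column is missing or empty are skipped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column].strip() for row in reader if row.get(column))
```

To use it, read the result file into a string and pass the string to sentiment_counts. The returned Counter maps each label to the number of rows that carry it.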

Manage annotation jobs

You can add funds to your DefinedCrowd account by clicking the link in the Integrations section on the Settings page.