Choosing a tool in Watson Studio
Watson Studio provides a range of tools for users with all levels of experience in preparing, analyzing, and modeling data, from beginner to expert.
To pick the right tool, consider these factors.
- The type of data you have
  - Tabular data in delimited files or relational data in remote data sources
  - Image files
  - Textual data in documents
- The type of tasks you need to do
  - Prepare data: cleanse, shape, visualize, organize, and validate data.
  - Analyze data: identify patterns and relationships in data, and display insights.
  - Build models: build, train, test, and deploy models to classify data, make predictions, or optimize decisions.
- How much automation you want
  - Code editor tools: Use to write code in Python, R, or Scala.
  - Graphical canvas tools: Use menus and drag-and-drop functionality on a canvas to visually program.
  - Automated tools: Use to configure automated tasks that require limited user input.
Find the right tool:
Tools for tabular or relational data
Tools for tabular or relational data by task:
Tool | Tool type | Prepare data | Analyze data | Build models |
---|---|---|---|---|
Jupyter notebook editor | Code editor | ✓ | ✓ | ✓ |
RStudio | Code editor | ✓ | ✓ | ✓ |
Data Refinery | Graphical canvas | ✓ | ✓ |  |
DataStage | Graphical canvas | ✓ |  |  |
Streams flow editor | Graphical canvas | ✓ | ✓ |  |
Dashboard editor | Graphical canvas |  | ✓ |  |
SPSS Modeler | Graphical canvas | ✓ | ✓ | ✓ |
Decision Optimization model builder | Graphical canvas and code editor | ✓ |  | ✓ |
AutoAI | Automated tool | ✓ |  | ✓ |
Metadata import | Automated tool | ✓ |  |  |
Tools for textual data
Tools for building a model that classifies textual data:
Tool | Code editor | Graphical canvas | Automated tool |
---|---|---|---|
Jupyter notebook editor | ✓ |  |  |
RStudio | ✓ |  |  |
SPSS Modeler |  | ✓ |  |
Experiment builder |  | ✓ |  |
Natural Language Classifier modeler |  |  | ✓ |
Tools for image data
Tools for building a model that classifies images:
Tool | Code editor | Graphical canvas | Automated tool |
---|---|---|---|
Jupyter notebook editor | ✓ |  |  |
RStudio | ✓ |  |  |
Experiment builder |  | ✓ |  |
Visual Recognition modeler |  |  | ✓ |
Accessing tools
To use a tool, you must create an asset specific to that tool, or open an existing asset for that tool.
To create an asset, click Add to project and then choose the asset type you want.
This table shows the asset type to choose for each tool.
To use this tool | Choose this asset type |
---|---|
Jupyter notebook editor | Jupyter notebook |
Data Refinery | Data Refinery flow |
DataStage | DataStage flow |
Streams flow editor | Streams flow |
Dashboard editor | Dashboard |
SPSS Modeler | Modeler flow |
Decision Optimization model builder | Decision Optimization |
AutoAI | AutoAI experiment |
Experiment builder | Experiment |
Visual Recognition modeler | Visual Recognition model |
Natural Language Classifier modeler | Natural Language Classifier model |
Metadata import | Metadata import |
RStudio is not associated with an asset type. To use RStudio, click Launch IDE > RStudio.
Jupyter notebook editor
Use the Jupyter notebook editor to create a notebook in which you run code to prepare, visualize, and analyze data, or build and train a model.
- Data format
- Any
- Data size
- Any
- How you can prepare data, analyze data, or build models
- Write code in Python, R, or Scala
Include rich text and media with your code
Work with any kind of data in any way you want
Use preinstalled or install other open source and IBM libraries and packages
Schedule runs of your code
Import a notebook from a file, a URL, or the Gallery
Share read-only copies of your notebook externally
- Get started
- To create a notebook, click Add to project > Notebook.
- Learn more
- Load and analyze public data sets video
Videos about notebooks
Sample notebooks
Documentation about notebooks
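For a sense of what notebook code looks like, here is a minimal Python sketch that loads and summarizes a tabular data set with pandas; the file name and column names are hypothetical:

```python
import pandas as pd

# Load a tabular data set into a DataFrame (file name is hypothetical)
df = pd.read_csv("sales.csv")

# Profile the data quickly
print(df.shape)
print(df.describe(include="all"))

# Derive a new column and summarize it by group (column names are hypothetical)
df["revenue"] = df["units"] * df["unit_price"]
print(df.groupby("region")["revenue"].sum().sort_values(ascending=False))
```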
Data Refinery
Use Data Refinery to prepare and visualize tabular data with a graphical flow editor. You create and then run a Data Refinery flow as a set of ordered operations on data.
- Data format
- Tabular: Avro, CSV, JSON, Parquet, TSV (read only), or delimited text files
Relational: Tables in relational data sources
- Data size
- Any
- How you can prepare data
- Cleanse, shape, organize data with over 60 operations
Save refined data as a new data set or update the original data
Profile data to validate it
Use interactive templates to manipulate data with code operations, functions, and logical operators
Schedule recurring operations on data
- How you can analyze data
- Identify patterns, connections, and relationships within the data in multiple visualization charts
- Get started
- To create a Data Refinery flow, click Add to project > Data Refinery flow.
- Learn more
- Videos about Data Refinery
Shape Data video
Documentation about Data Refinery
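A Data Refinery flow is built on the canvas, not in code, but conceptually it runs an ordered sequence of operations like this rough pandas sketch (purely illustrative; the file and column names are hypothetical):

```python
import pandas as pd

# Read the source data (file name is hypothetical)
df = pd.read_csv("customers.csv")

# Ordered cleanse-and-shape steps, analogous to operations in a Data Refinery flow
df = df.drop_duplicates()                               # remove duplicate rows
df["country"] = df["country"].str.strip().str.upper()   # standardize a text column
df = df[df["age"].between(18, 99)]                      # filter out implausible values
df = df.rename(columns={"cust_id": "customer_id"})      # rename a column

# Save the refined result as a new data set
df.to_csv("customers_shaped.csv", index=False)
```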
DataStage
Use DataStage to prepare and transform tabular data with a graphical flow editor. You create and then run a DataStage flow as a set of ordered operations on data.
- Data format
- Tabular: Avro, CSV, JSON, Parquet, TSV (read only), or delimited text files
Relational: Tables in relational data sources
- Data size
- Any
- How you can prepare data
- Design a graphical data integration flow that generates Orchestrate code to run on the high-performing DataStage parallel engine.
- Perform operations such as Join, Funnel, Checksum, Merge, Modify, Remove Duplicates, and Sort.
- Get started
- To create a DataStage flow, click Add to project > DataStage flow.
- Learn more
- DataStage documentation
Streams flow editor
Use the streams flow editor to access and analyze streaming data. You can create a streams flow with a wizard or with a flow editor on a graphical canvas.
- Required service
- Streaming Analytics
- Data format
- Streaming data as JSON messages
Streaming binary data
- Data size
- Any
- How you can prepare data
- Ingest streaming data
Aggregate, filter, and process streaming data
Process streaming data for a model
- How you can analyze data
- Run real-time analytics on streaming data
- Get started
- To create a streams flow, click Add to project > Streams flow.
- Learn more
- Streams flow Overview video
Videos about streams flows
Documentation about streams flows
Dashboard editor
Use the Dashboard editor to create a set of visualizations of analytical results on a graphical canvas.
- Required service
- Cognos Dashboard Embedded
- Data format
- Tabular: CSV files
Relational: Tables in some relational data sources
- Data size
- Any size
- How you can analyze data
- Create graphs without coding
Include text, media, web pages, images, and shapes in your dashboard
Share interactive dashboards externally
- Get started
- To create a dashboard, click Add to project > Dashboard.
- Learn more
SPSS Modeler
Use SPSS Modeler to create a flow to prepare data and build and train a model with a flow editor on a graphical canvas.
- Data formats
- Relational: Tables in relational data sources
Tabular: Excel files (.xls or .xlsx), CSV files, or SPSS Statistics files (.sav)
Textual: In the supported relational tables or files
- Data size
- Any
- How you can prepare data
- Use automatic data preparation functions
Write SQL statements to manipulate data
Cleanse, shape, sample, sort, and derive data
- How you can analyze data
- Visualize data with over 40 graphs
Identify the natural language of a text field
- How you can build models
- Build predictive models
Choose from over 40 modeling algorithms
Use automatic modeling functions
Model time series or geospatial data
Classify textual data
Identify relationships between the concepts in textual data
- Get started
- To create an SPSS Modeler flow, click Add to project > Modeler flow and then choose IBM SPSS Modeler.
- Learn more
- SPSS Modeler - refreshed UI for an enterprise data science powerhouse video
Documentation about SPSS Modeler
Decision Optimization model builder
Use Decision Optimization to build and run optimization models in the Decision Optimization modeler or in a Jupyter notebook.
- Data formats
- Tabular: CSV files
- Data size
- Any
- How you can prepare data
- Import relevant data into a scenario and edit it.
- How you can build models
- Build prescriptive decision optimization models.
Create, import, and edit models in Python DOcplex, OPL, or with natural language expressions.
Create, import, and edit models in notebooks.
- How you can solve models
- Run and solve decision optimization models using CPLEX engines.
Investigate and compare solutions for multiple scenarios.
Create tables, charts, and notes to visualize data and solutions for one or more scenarios.
- Get started
- To create a Decision Optimization model, click Add to project > Decision Optimization, or for notebooks click Add to project > Notebook.
- Learn more
- Introduction to Decision Optimization for Watson Studio
Decision Optimization videos
Documentation about Decision Optimization
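As a taste of the Python DOcplex option, here is a minimal sketch of a linear optimization model; the decision variables, constraints, and coefficients are invented for illustration:

```python
from docplex.mp.model import Model

# A toy production-mix model (all names and numbers are illustrative)
mdl = Model(name="production_mix")

product_a = mdl.continuous_var(name="product_a", lb=0)
product_b = mdl.continuous_var(name="product_b", lb=0)

# Resource constraints
mdl.add_constraint(2 * product_a + 1 * product_b <= 100, "machine_hours")
mdl.add_constraint(1 * product_a + 3 * product_b <= 90, "labor_hours")

# Objective: maximize profit
mdl.maximize(30 * product_a + 40 * product_b)

if mdl.solve():
    mdl.print_solution()
```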
AutoAI tool
Use the AutoAI tool to automatically analyze your tabular data and generate candidate model pipelines customized for your predictive modeling problem.
- Required service
- Watson Machine Learning
- Data format
- Tabular: CSV files
- Data size
- Less than 1 GB
- How you can prepare data
- Automatically transform data, such as impute missing values
- How you can build models
- Train a binary classification, multiclass classification, or regression model
View a tree infographic that shows the sequences of AutoAI training stages
Generate a leaderboard of model pipelines ranked by cross-validation scores
Save a pipeline as a model
- Get started
- To create an AutoAI experiment, click Add to project > AutoAI experiment.
- Learn more
- Documentation about AutoAI
Experiment builder
Use the Experiment builder to build deep learning experiments and run hundreds of training runs. This method requires that you provide code to define the training run. You run, track, store, and compare the results in the Experiment Builder graphical interface, then save the best configuration as a model.
- Data format
- Textual: CSV files with labeled textual data
Image: Image files in a PKL file. For example, a model testing signatures uses images resized to 32×32 pixels and stored as numpy arrays in a pickled format.
- Data size
- Any size
- How you can build models
- Write Python code to specify metrics for training runs
Write a training definition in Python code
Define hyperparameters, or choose the RBFOpt method or random hyperparameter settings
Find the optimal values for large numbers of hyperparameters by running hundreds or thousands of training runs
Run distributed training with GPUs and specialized, powerful hardware and infrastructure
Compare the performance of training runs
Save a training run as a model
- Get started
- To create an experiment, click Add to project > Experiment.
- Learn more
- Neural Network Modeler and Deep Learning Experiments on Watson Studio video
Documentation about Experiment builder
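The training definition you supply is ordinary Python. Here is a rough skeleton, assuming the pickled numpy arrays described above and using Keras; the file name, array layout, and hyperparameter handling are assumptions that depend on how you configure the experiment:

```python
import argparse
import pickle

from tensorflow import keras

# Hyperparameters for this training run (how they are passed in depends on your experiment setup)
parser = argparse.ArgumentParser()
parser.add_argument("--learning_rate", type=float, default=0.001)
parser.add_argument("--epochs", type=int, default=10)
args = parser.parse_args()

# Load pickled numpy arrays of 32x32 images and labels (file name and layout are assumptions)
with open("signatures.pkl", "rb") as f:
    x_train, y_train = pickle.load(f)

# A small classification network for 32x32 single-channel images
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(32, 32)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(2, activation="softmax"),
])
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=args.learning_rate),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

history = model.fit(x_train, y_train, epochs=args.epochs, validation_split=0.2)

# Report the metric that the experiment compares across runs
print("val_accuracy:", max(history.history["val_accuracy"]))
```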
Visual Recognition modeler
Use the Visual Recognition modeler to automatically train a model to classify images for scenes, objects, and other content.
- Required service
- Visual Recognition
- Data format
- Image: JPEG or PNG files in a .zip file, separated by class
- Data size
- Small to medium data sets
- How you can build models
- Collaborate to classify images
Use built-in models
Test the model with sample images
Use Core ML to develop iOS apps
Provide as few as 10 images per class
Add or remove images to retrain the model
Use Watson Visual Recognition APIs in applications
- Get started
- To create a Visual Recognition model, click Add to project > Visual Recognition model.
- Learn more
- Get Started With Visual Recognition video
Videos about Visual Recognition
Documentation about Visual Recognition
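To use Watson Visual Recognition APIs in an application, a minimal sketch with the ibm-watson Python SDK might look like this; the API key, service URL, classifier ID, and image file are placeholders:

```python
from ibm_watson import VisualRecognitionV3
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# Credentials and URL come from your Visual Recognition service instance (placeholders here)
authenticator = IAMAuthenticator("your-api-key")
visual_recognition = VisualRecognitionV3(version="2018-03-19", authenticator=authenticator)
visual_recognition.set_service_url("https://api.us-south.visual-recognition.watson.cloud.ibm.com")

# Classify an image with your trained custom classifier (ID and file name are placeholders)
with open("test_image.jpg", "rb") as image_file:
    result = visual_recognition.classify(
        images_file=image_file,
        classifier_ids=["your_classifier_id"],
        threshold=0.6,
    ).get_result()

print(result)
```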
Natural Language Classifier modeler
Use the Natural Language Classifier modeler to automatically train a model to classify text according to classes you define.
- Required service
- Natural Language Classifier
- Data format
- Textual: CSV files with sample text and class names
- Data size
- Small to medium data sets
- How you can build models
- Provide as few as 3 text samples per class
Collaborate to classify text samples
Test the model with sample text
Add or remove test data to retrain the model
Classify text in eight languages other than English
Use Watson Natural Language Classifier APIs in applications
- Get started
- To create a Natural Language Classifier model, click Add to project > Natural Language Classifier model.
- Learn more
- Documentation about Natural Language Classifier modeler
- Videos about Natural Language Classifier modeler
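To use Watson Natural Language Classifier APIs in an application, a minimal sketch with the ibm-watson Python SDK might look like this; the API key, service URL, and classifier ID are placeholders:

```python
from ibm_watson import NaturalLanguageClassifierV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# Credentials and URL come from your Natural Language Classifier instance (placeholders here)
authenticator = IAMAuthenticator("your-api-key")
classifier_service = NaturalLanguageClassifierV1(authenticator=authenticator)
classifier_service.set_service_url("https://api.us-south.natural-language-classifier.watson.cloud.ibm.com")

# Classify a piece of text with your trained classifier (ID is a placeholder)
result = classifier_service.classify("your-classifier-id", "I have a problem with my invoice").get_result()
print(result)
```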
Metadata import
Use the metadata import tool to automatically discover and import technical and process metadata for data assets into a project or a catalog.
- Required service
- Watson Knowledge Catalog
- Data format
- Relational: Tables in relational data sources
- Data size
- Any size
- How you can prepare data
- Import data assets from a connection to a data source
- Get started
- To import metadata, click Add to project > Metadata import.
- Learn more
- Documentation about metadata import
- Videos about Watson Knowledge Catalog
RStudio IDE
Use the RStudio IDE to analyze data or create Shiny applications by writing R code. You can integrate RStudio with a Git repository, which must be associated with the project.
- Data format
- Any
- Data size
- Any size
- How you can prepare data, analyze data, and build models
- Write code in R
Create Shiny apps
Use open source libraries and packages
Include rich text and media with your code
Prepare data
Visualize data
Discover insights from data
Build and train a model using open source libraries
Share your Shiny app in a Git repository
- Get started
- To use RStudio, click Launch IDE > RStudio.
- Learn more
- Overview of RStudio IDE video
Videos about RStudio
Documentation about RStudio