What’s new

Check out the new features for Cloud Pak for Data as a Service and the core services of Watson Studio, Watson Machine Learning, and Watson Knowledge Catalog each week.

Week of 26 October 2020

Master Data Management service beta

The Master Data Management service is now in beta. This new Master Data Management experience seamlessly consolidates data from the disparate sources that fuel your business to establish a single, trusted, 360-degree view of your customers.

The beta release of Master Data Management includes two user experiences:

  • Master data configuration for data engineers to prepare and configure master data. This experience enables you to:
    • Configure the master data system.
    • Refine the generated data model.
    • Upload or connect data assets and sources.
    • Map data into the model.
    • Run the Master Data Management service’s powerful matching capability to create master data entities.
    • Configure and tune the matching algorithm to meet your organization’s requirements.
  • Master data explorer for business analysts and data stewards to search, view, analyze, and export master data entities.

The Master Data Management service on Cloud Pak for Data as a Service also includes a rich set of APIs that empower your business applications with direct access to trusted data.

For more information about Master Data Management, see Managing master data (Beta).

Week ending 23 October 2020

New way of adding data

Create metadata import assets to configure and run the metadata import for a selected set of data assets into a project or a catalog. For details, see Managing metadata imports.

SJIS encoding available in Data Refinery for input and output

SJIS (“Shift JIS” or Shift Japanese Industrial Standards) is a character encoding for the Japanese language.

To change the encoding of the input file, open the file in Data Refinery, go to the Data tab, scroll down to the SOURCE FILE information, and then click the “Specify data format” icon Specify data format icon.

Specify data format

To change the encoding of the output (target) file, open the Information pane Info icon and click the Details tab. Click the Edit button. In the DATA REFINERY FLOW OUTPUT pane, click the Edit icon to change the encoding.

Output file encoding

The SJIS encoding is supported only for CSV and delimited files.

New visualization charts for Data Refinery and SPSS Modeler

To access the charts in Data Refinery, click the Visualizations tab and then select the columns to visualize. The chart automatically updates as you refine the data. 

To access the charts in SPSS Modeler, use a Charts node. The Charts node is available under the Graphs section on the node palette. Double-click the Charts node to open the properties pane. Then click Launch Chart Builder to create one or more chart definitions to associate with the node.

For the full list of available charts, see Visualizing your data.

  • Evaluation charts are combination charts that measure the quality of a binary classifier. You need three columns for input: the actual (target) value, the predicted value, and the confidence (a value between 0 and 1). Move the slider in the Cutoff chart to dynamically update the other charts. The ROC and other charts are standard measurements of the classifier.

    Evaluation chart

  • Math curve charts display a group of curves based on equations that you enter. You do not use a data set with this chart. Instead, you use it to compare the results with the data set in another chart, like the scatter plot chart.

    Math curve chart

  • Sunburst charts display different depths of hierarchical groups. The Sunburst chart was formerly an option in the Treemap chart.

    Sunburst chart

  • Tree charts represent a hierarchy in a tree-like structure. The Tree chart consists of a root node, line connections called branches that represent the relationships and connections between the members, and leaf nodes that do not have child nodes. The Tree chart was formerly an option in the Treemap chart.

    Tree chart

Week ending 16 October 2020

Change to Watson Machine Learning deployment frameworks

The following changes to deployment frameworks might require user action.

Support for Python 3.7

You can now select Python 3.7 frameworks to train models and run Watson Machine Learning deployments.

Deprecation of Python 3.6

Python 3.6 is being deprecated. Support will be discontinued on January 20, 2021. You can continue to use the Python 3.6 frameworks; however, you will be notified that you should move to a Python 3.7 framework. For details on migrating an asset to a supported framework, see Supported frameworks.

Support for Spark 3.0 and new language versions

  • Spark 3.0

    • You can now select a Spark 3.0 environment to run notebooks with Python 3.7, R 3.6, and Scala 2.12, or to run notebook jobs.
  • New languages

    • Python 3.7

      You can select Python 3.7 environments to run Jupyter notebooks (including with GPU) in Watson Studio.

      Python 3.6 is being deprecated. You can continue to use the Python 3.6 environments; however, you will be notified that you should move to a Python 3.7 environment. When you switch to Python 3.7, you might need to update code in notebooks if the versions of open source libraries that you used are different in Python 3.7.

    • Scala 2.12

      With the introduction of Spark 3.0, you can start using Spark with Scala 2.12 in notebooks and jobs. Again, you might need to update code in your notebooks if library versions that you used with Scala 2.11 are not compatible with Scala 2.12.

Support ends for SPSS Modeler runtime 18.1 and certain Python nodes

Support for SPSS Modeler flows that were trained with version 18.1 and contain certain Python nodes is discontinued as of October 14, 2020. If your SPSS models use any of these Python nodes, you must retrain the models by using Watson Studio Canvas or any tool that uses SPSS Modeler version 18.2. For details, see Supported frameworks.

Support ends for deployments based on deprecated AutoAI images

Due to a known security vulnerability, AutoAI model deployments created using Watson Machine Learning on IBM Cloud prior to August 1, 2020 will be removed on November 1, 2020. If you have not already migrated and redeployed your AutoAI models, do so prior to November 1, 2020. For details, see Migrating Assets.

Data Refinery flows are supported in deployment spaces

You can now promote a Data Refinery flow from a project to a deployment space. Deployment spaces are used to manage a set of related assets in a separate environment from your projects. You can promote Data Refinery flows from multiple projects to a space. You run a job for the Data Refinery flow in the space and then use the shaped output as input for deployment jobs in Watson Machine Learning. For instructions, see Promote a Data Refinery flow to a space in Managing Data Refinery flows.

Week ending 9 October 2020

Time series library for notebooks

You can now use the time series library to perform operations on time series data, including segmentation, forecasting, joins, transforms and reducers. You can use the time series library functions in Python notebooks that run with Spark. See Time series library.
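
As a minimal sketch, assuming the library is exposed to Python notebooks as tspy (the observation values are made up):

import tspy

# Build a small in-memory time series from a list of observations.
ts = tspy.time_series([5.0, 2.0, 4.0, 6.0, 7.0])
print(ts)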

Week ending 2 October 2020

Updates to Watson Knowledge Catalog offering plans

Starting 1 October, 2020, the Watson Knowledge Catalog offering plans have these changes:

  • Lite plan: The maximum number of users is reduced to 2.
  • Standard plan: The maximum number of assets is increased to 1000.
  • Standard and Professional plans: The cost for the plan, extra users, and extra CUH is changing.

If you already have the Lite or Standard plan, your existing assets and catalog users remain unchanged.

Read the blog post.

End of migration period for Watson Machine Learning Lite plans

On October 1, 2020, the migration period ended for Watson Machine Learning Lite plan users with V1 machine learning service plans created before September 1, 2020. For details on the migration to new plans and service instances, see the what’s new entry for September 4, 2020.

Week ending 25 September 2020

Batch deployment available for AutoAI experiments

Starting with the V2 Watson Machine Learning service instance and the V4 Watson Machine Learning APIs, rolled out on September 1, 2020, batch deployment is supported for AutoAI experiments. For details, see Creating a batch deployment.

New Databases for EDB service

You can now create the Databases for EDB service from the Cloud Pak for Data as a Service services catalog to access EDB Postgres Advanced Server. See Databases for EDB.

Week ending 18 September 2020

New jobs user interface for running and scheduling Data Refinery flows and notebooks

The new jobs user interface gives you a unified view of job information.

Data Refinery flow job
Data Refinery job

Notebook job
Notebook job

You create the job for a Data Refinery flow or for a notebook directly in the user interface for each tool or from the Assets page of a project. See Jobs in a project.

Week ending 11 September 2020

Change to Watson Machine Learning service credentials

The new V2 Watson Machine Learning service instance rolled out on September 1 uses new, simplified authentication. Obtaining bearer tokens from IAM is now performed using a generic user apikey instead of a Watson Machine Learning specific apikey. It is no longer necessary to create specific credentials on the Watson Machine Learning instance, so the Credentials page was removed from the IBM Cloud services catalog. For details, see Authentication.
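
For example, a minimal sketch of exchanging a generic API key for a bearer token with the Python requests package (the IAM endpoint shown is the public IBM Cloud endpoint; the API key value is a placeholder):

import requests

# Exchange a generic IBM Cloud API key for an IAM bearer token.
response = requests.post(
    "https://iam.cloud.ibm.com/identity/token",
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    data={
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        "apikey": "<your-ibm-cloud-apikey>",  # placeholder
    },
)
token = response.json()["access_token"]
# Pass the token as "Authorization: Bearer <token>" on Watson Machine Learning requests.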

During the migration period, you can use existing Watson Machine Learning service credentials to access your legacy V1 service instance and assets. Lite users of instances provisioned before September 1 can keep using existing credentials during the migration period but cannot generate new credentials. Standard and Professional plan users can follow the steps in Generating legacy Watson Machine Learning credentials to create new credentials.

Spark 2.3 deprecation

Starting 1 October, 2020, you can no longer select a Spark 2.3 environment to run a notebook or job. Select a Spark 2.4 environment instead. Existing notebooks and jobs with Spark 2.3 environments will continue running until 30 November, 2020. After that, you must select a different environment for the affected notebooks and jobs.

Week ending 4 September 2020

New Watson Machine Learning service plans

Watson Machine Learning released new plans on IBM Cloud. These new plans accommodate and provide entitlements for the newest features and patterns available to Watson Machine Learning users, starting on September 1, 2020. 

New sign-ups will receive the latest plans and API entitlements; no Watson Machine Learning instances that correspond to the older plans can be provisioned. As of March 1, 2021, only Watson Machine Learning instances that correspond to the updated plans will be supported. The following sections introduce the new features and describe how to best plan your migration.

Upgrading to Watson Machine Learning “v2” Instances

All Lite plan users are automatically upgraded from v1 to v2 service plans. Lite plan users can now call the v4 APIs or use the v4 Python client library to conduct machine learning model training, model saving, and deployment, and access the newest features such as runtime software specifications for your deployments. 

For Standard plan and Professional plan users, you can choose when to migrate your assets for use with the v2 machine learning service instance. Users of these plans can work with both the older and the newer API sets and plan instances until March 1, 2021. For details on working with a deprecated service instance, see Generating legacy Watson Machine Learning credentials.

Note: During the migration period, you will not be charged for your usage associated with the new plan instances while your v1 plan instances are still active.

Get details about the Watson Machine Learning plans.

Full support for v4 APIs and an updated Python client library

The v4 APIs and Python client library are now generally available for use with the v2 service plans. The new APIs support features such as deployment spaces for organizing all of the assets required for running and managing deployment jobs, software specifications, and updated authentication. Note that support for v3 and v4 beta APIs ends on March 1, 2021. Review the differences between the v3, v4 beta, and v4 APIs.
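
For illustration, a minimal sketch of authenticating with the updated client (this assumes the client is installed as the ibm-watson-machine-learning package; the API key is a placeholder and the URL is the Dallas region endpoint):

from ibm_watson_machine_learning import APIClient

wml_credentials = {
    "url": "https://us-south.ml.cloud.ibm.com",  # your service region endpoint
    "apikey": "<your-ibm-cloud-apikey>",         # placeholder
}

client = APIClient(wml_credentials)
client.spaces.list()  # list the deployment spaces that you can work with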

Migration assistance for Watson Machine Learning 

Watson Machine Learning users can migrate their Watson Machine Learning repository assets, such as machine learning models, to Watson Studio projects with automated assistance from a graphical migration tool, or programmatically, by using a dedicated set of APIs.

Introducing deployment spaces

Deployment spaces let you deploy and manage models and other analytical assets such as data sources and software specifications in a separate environment from your projects. When your project assets are ready to deploy, you promote assets to your deployment space to configure deployments, test models and functions, consume scoring endpoints, and manage production jobs. Spaces, like projects, are collaborative, so you can invite others to collaborate and manage access for a space.
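
As a hedged sketch of that workflow with the Python client (assuming the ibm-watson-machine-learning package; the credentials, space ID, deployment ID, and input schema are placeholders or invented):

from ibm_watson_machine_learning import APIClient

# Authenticate as in the earlier sketch; the values are placeholders.
client = APIClient({
    "url": "https://us-south.ml.cloud.ibm.com",
    "apikey": "<your-ibm-cloud-apikey>",
})

# Work inside a deployment space, then score against a deployed model.
client.set.default_space("<space-id>")

payload = {
    client.deployments.ScoringMetaNames.INPUT_DATA: [
        {"fields": ["age", "income"], "values": [[42, 50000]]}  # invented schema
    ]
}
predictions = client.deployments.score("<deployment-id>", payload)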

Watson Studio enhancements

Watson Studio now leverages the newest Watson Machine Learning APIs. Consequently, to take actions from the Watson Studio interface that require Watson Machine Learning, such as triggering AutoAI experiments, you must have a “v2” Watson Machine Learning instance associated with the project. Watson Studio projects that are still associated with an older Watson Machine Learning instance will display a message that instructs you to migrate your assets and associate a new v2 Watson Machine Learning instance. 

Additionally, starting on September 1st, Watson Studio users will be able to save models and other artifacts produced through use of Watson Machine Learning alongside other assets like notebooks in their Watson Studio project. For details on machine learning tools you can use to create project assets, see Machine Learning Overview.

Decision Optimization support for Watson Machine Learning

Decision Optimization supports all of the new features available with Watson Machine Learning, including using software and hardware specifications to configure optimization models and using deployment spaces to organize the assets required for deployment. For a complete list of changes, see Migrating from Watson Machine Learning API V4 Beta.

Watson Machine Learning migration action plan

Follow these steps to migrate assets and upgrade your service instance to take advantage of the new Watson Machine Learning plans and features.

  1. Review the updated Watson Machine Learning plans and consider which level of service is right for you.
  2. Migrate your assets, using the Migration Assistant tool from Watson Studio or using a programmatic solution.
  3. Start using your new machine learning service instance.
  4. Retrain models or update your Python functions, as needed.
  5. Create a deployment space and start to work with your migrated assets.
  6. Delete your old v1 service instance.

Spark 2.3 framework for Watson Machine Learning deprecated

Spark 2.3 framework for Watson Machine Learning is deprecated and will be removed on December 1, 2020. Use Spark 2.4 instead. For details, see Supported frameworks.

Compatibility issue for SPSS Modeler runtime 18.1 and older Python version

Support for SPSS Modeler flows that were trained with version 18.1 and contain certain Python nodes is deprecated. For existing deployments that use these nodes, you can continue to score the deployments until October 1, 2020. If your SPSS models use any of these Python nodes, you must retrain the models by using Watson Studio Canvas or any tool that uses SPSS Modeler version 18.2. For details, see Supported frameworks.

Decision Optimization enhancements

You can now use these enhancements to Decision Optimization:

  • The Decision Optimization model builder now contains a new Overview pane, which provides model, data, and solution summary information for all your scenarios at a glance. From this view, you can also open an information pane where you can create or choose your deployment space. See Overview pane.
  • To create and run optimization models, you must have both a Machine Learning service added to your project and a deployment space associated with your experiment.
  • You can now deploy models using Watson Machine Learning from the Decision Optimization model builder scenario pane. See Scenario pane and Deploying a model using the user interface.
  • A new sample Extend software specification notebook is now available, which shows you how to extend the Decision Optimization software specification (runtimes with additional Python libraries for docplex models). See Python client examples and download the sample from https://github.com/IBMDecisionOptimization/DO-Samples/tree/watson_studio_cloud/jupyter.
  • The Explore solution view of the model builder has been updated to show more information about the objectives and KPIs, solution tables, relaxations or conflicts in constraints and bounds, and engine statistics and logs. See Explore solution view.
  • The Visualization view of the model builder now enables you to create Gantt charts for any type of data where they are meaningful; it is no longer restricted to scheduling models. See Visualizations view.

Translation of documentation

You can now read this documentation in these languages by setting your browser locale:

  • Brazilian Portuguese
  • Simplified Chinese
  • Traditional Chinese
  • French
  • German
  • Italian
  • Spanish
  • Japanese
  • Korean
  • Russian

Not all documentation topics are translated into all of these languages.

Jupyter notebook editor upgraded

The Jupyter notebook editor in Watson Studio is upgraded from Jupyter notebook version 6.0.3 to version 6.1.1. For a list of changes, including keyboard shortcut changes, see Jupyter Notebook Changelog.

Week ending 21 August 2020

Enhanced Cognos Dashboards

You can now use these enhancements to dashboards in a project:

  • New visualizations, including Waterfall, KPI widget, and an enhanced cross-tab.
  • A contextual toolbar and a data mapping panel.

See Cognos Dashboards.

Databases for MongoDB service

You can now provision a Databases for MongoDB service from the Services catalog.

See Databases for MongoDB.

Use Data Refinery to change the decimal and thousands grouping symbols in all applicable columns

When you use the Convert column type GUI operation to detect and convert the data types for all the columns in a data asset, you can now also choose the decimal symbol and the thousands grouping symbol if the data is converted to an Integer or to a Decimal data type. Previously you had to select individual columns to specify the symbols.

Convert column type GUI operation for numbers

See Convert column type in GUI operations in Data Refinery, under the FREQUENTLY USED category.

Week ending 14 August 2020

Google Cloud Platform integration

You can now configure an integration with the Google Cloud Platform (GCP) to access data sources from GCP.

See Integrating with Google Cloud Platform.

Week ending 31 July 2020

Security update for AutoAI deployments

There is a known security vulnerability with the image used for AutoAI model deployments created using Watson Machine Learning on IBM Cloud prior to August 1, 2020. The image vulnerability has been addressed, so deployments of models created with AutoAI experiments after August 1, 2020 are not impacted. The following remedies are available:

For Lite plan users

Impacted AutoAI deployments will be deprecated (stop working) on September 1, 2020. You can redeploy your models in August, and then migrate them to a new deployment space in September 2020.

For Standard and Professional plans users

Impacted AutoAI deployments will be deprecated (stop working) on November 1, 2020. You can redeploy your models in August, and then migrate them to a new deployment space in September or October 2020.

Removal of Neural Network Modeler and SparkML modeler

Both the beta Neural Network Modeler and the beta SparkML modeler tools are removed from Watson Studio.

Week ending 24 July 2020

Cloud Pak for Data as a Service is GA!

Cloud Pak for Data as a Service is now generally available. Sign up for Cloud Pak for Data as a Service at dataplatform.cloud.ibm.com.

Learn more about Cloud Pak for Data as a Service.

Read the blog.

Subscribe to Cloud Pak for Data as a Service

You can now upgrade your IBM Cloud account from a Lite plan by subscribing to Cloud Pak for Data as a Service. With a subscription, you commit to a minimum spending amount for a certain period of time and receive a discount on the overall cost compared to a Pay-As-You-Go account.

See Upgrading to a Cloud Pak for Data as a Service subscription account.

Learn quickly with a guided tutorial

You can quickly learn how to use tools in projects by taking a guided tutorial. A guided tutorial starts with a sample project that contains the data and anything else you need. After you create the project, the tutorial starts and guides you through the steps to solve a specific business problem.

Click the Take a guided tutorial link on your home page.

New services catalog

You can now create IBM Cloud services that work with Cloud Pak for Data as a Service from the new services catalog. Select Services catalog from the main menu. You can see all your services by selecting the Your services option.

See IBM Cloud services.

Integrate with other cloud platforms

You can now configure integration with other cloud platforms to simplify creating connections to data sources in those cloud platforms in projects and catalogs. Select Cloud integration from the main menu. 

See Integrating with other cloud platforms.

Use the new Data Refinery “Union” operation to combine the rows from two data sets that share the same schema

Data Refinery Union operation

The Union operation is in the ORGANIZE category. For more information, see GUI operations in Data Refinery.

Automatically detect and convert date and timestamp data types

When you open a file in Data Refinery, the Convert column type GUI operation is automatically applied as the first step if it detects any non-string data types in the data. Now date and timestamp data are detected and are automatically converted to inferred data types. You can change the automatic conversion for selected columns or undo the step. For information about the supported inferred date and timestamp formats, see Convert column type in GUI operations in Data Refinery, under the FREQUENTLY USED category.

Week ending 17 July 2020

Starting on 21 July, you’ll see some changes to the Watson Studio, Watson Machine Learning, and Watson Knowledge Catalog services.

What is changing
Your home page will look different and show more information.
You’ll have some new options on the main menu: 
  • A Services catalog option where you can create IBM Cloud services that work with Watson Studio and Watson Knowledge Catalog. See IBM Cloud services.
  • A Cloud integration option for configuring integrations to other cloud platforms to simplify creating connections to data sources in those cloud platforms. See Integrating with other cloud platforms.
What might change
Your product brand might change to Cloud Pak for Data. If you provisioned any services that work with Watson Studio besides Watson Machine Learning and Watson Knowledge Catalog, you’ll see Cloud Pak for Data as the product brand. See Relationships between the Watson Studio and Watson Knowledge Catalog services and Cloud Pak for Data as a Service.
What isn’t changing
Your service plans and billing for your IBM Cloud services remain the same.

See Cloud Pak for Data as a Service overview.

Week ending 10 July 2020

Upcoming removal of Neural Network Modeler and SparkML modeler

Both the beta Neural Network Modeler and the beta SparkML modeler tools will be removed from Watson Studio on July 31.

If you use the Neural Network Modeler, similar functionality will be added to AutoAI in the future. Download your existing Neural Network Modeler flows as TensorFlow, Keras, Caffe, or PyTorch code.

If you use the SparkML modeler, you can find similar drag-and-drop visual modeling functionality in SPSS Modeler. Alternatively, you can process big data with powerful Spark environments in Jupyter notebooks. Export your existing SparkML modeler flows as compressed files.

Easily add data from a Cognos Analytics connection to a notebook

You can now add data from a Cognos Analytics connection by using the Insert to code function for the connection within a notebook. See Data load support.

The Lineage page is renamed to Activities

The Lineage page that you can see when you open a data asset in a catalog or project is now called Activities. The information shown on this page remains the same.

Week ending 3 July 2020

Upcoming Watson Machine Learning plan changes

As part of a broader Watson Machine Learning update coming in September, Lite plan users will be automatically upgraded to new “v2” plan instances on September 1. With this automatic upgrade, Lite plan users can use the new v4 Watson Machine Learning APIs to support new features for building and deploying assets. Additionally, Lite plan users can upgrade to the Standard plan instances free of charge before September 1 to retain entitlement to the older API set during the migration period. With the Standard plan, you pay only for capacity unit hours and predictions, but you can take other actions, such as reading model information, free of charge.

Starting September 1, Watson Machine Learning Lite plans will include a maximum of 20 capacity unit hours, instead of 50.

Week ending 12 June 2020

Perform aggregate calculations on multiple columns in Data Refinery

Now you can select multiple columns in the Aggregate operation. Previously all aggregate calculations applied to one column.

Aggregate operation with two columns

For more information, see Aggregate in GUI operations in Data Refinery, under the ORGANIZE category.

Filter values in a Boolean column in Data Refinery

You can now use these operators in the Filter GUI operation to filter Boolean (logical) data:
  • Is false
  • Is true

Filter Boolean GUI operation

For more information, see Filter in GUI operations in Data Refinery, under the FREQUENTLY USED category.

In addition, a new template for filtering by Boolean values has been added to the filter coding operation.

filter(`<column>` == <logical>)

For example, filter(`approved` == TRUE) keeps only the rows where the hypothetical approved column is true.

For more information about the filter templates, see Interactive code templates in Data Refinery.

Week ending 05 June 2020

SPSS Modeler flow properties

You can now set flow properties. For details, see Setting properties for flows.

Week ending 22 May 2020

AutoAI Auto-generated WML notebooks and SDK available in beta

You now have two options for saving an AutoAI pipeline as a notebook:

  • WML notebook - Work with a trained model in an annotated notebook. You can review and update the code, view visualizations, and deploy the model with Watson Machine Learning.
  • AutoAI_lib notebook - View the Scikit-Learn source code for the trained model in a notebook. Does not require Watson Machine Learning.

Additionally, the Watson Machine Learning Python client has been extended to include an SDK for the WML notebook. For details, see Saving an AutoAI generated notebook. Note: These features are being offered as a beta and are subject to change.

Changes to the Watson Studio plans

Starting on May 19, 2020, the Watson Studio plans have the following changes:

  • All plans: The free compute environment is no longer available. All your compute usage now consumes capacity unit hours. The Lite plan has a limit of 50 capacity unit hours per month.
  • Lite and Standard plans: Compute environments provided by associated services, such as IBM Analytics Engine, are now available only with the Enterprise plan.
  • Lite plan: Only the smallest size Spark environments are now available for Lite plans, with 2 executors that each have 1 vCPU and 4 GB RAM, and one driver that has 1 vCPU and 4 GB RAM. Large compute environments with 8 or more vCPU are no longer available for the Lite plan.
  • Lite plan: The ability to export projects now requires the Standard or Enterprise plan.

If you have an analytical asset, for example, a notebook, or a job that uses an environment that is no longer available, you will see a message to select a different environment. See Changing your environment.

If you need more compute resources, upgrade to the Watson Studio Standard or Enterprise plan. See Upgrading Watson Studio.

This change was first announced on March 17 here and in this blog post.

Week ending 01 May 2020

More Decision Optimization compute options

You now have more options that cost less when you run Decision Optimization jobs. You can choose from new, more powerful Decision Optimization environments. The basic Decision Optimization compute environment now consumes only five capacity unit hours (CUH) instead of 20 CUH. The new environments consume 6-13 CUH.

Read the blog.

“PureData System for Analytics” connection renamed to “Netezza (PureData System for Analytics)”

The PureData System for Analytics connection is now the Netezza (PureData System for Analytics) connection. This change reflects the announcement of the new Netezza Performance Server for on-premises and cloud deployments. Your previous settings for a connection to PureData System for Analytics remain the same. Only the connection name changed.

New visualization charts in Data Refinery

Data Refinery introduces six new charts. To access the charts, click the Visualizations tab in Data Refinery, and then select the columns to visualize. The chart automatically updates as you refine the data.

  • Bubble charts display each category in the groups as a bubble.

    Bubble chart

  • Circle packing charts display hierarchical data as a set of nested areas.

    Circle packing chart

  • Multi-charts display up to four combinations of Bar, Line, Pie, and Scatter plot charts. You can show the same kind of chart more than once with different data. For example, two pie charts with data from different columns.

    Multi-chart

  • Radar charts integrate three or more quantitative variables that are represented on axes (radii) into a single radial figure. Data is plotted on each axis and joined to adjacent axes by connecting lines. Radar charts are useful to show correlations and compare categorized data.

    Radar chart

  • Theme river charts use a specialized flow graph that shows changes over time.

    Theme river chart

  • Time plot charts illustrate data points at successive intervals of time.

    Time plot chart

Week ending 24 April 2020

Synchronizing assets with Information Governance Catalog is discontinued

You can no longer automatically synchronize data assets between Information Governance Catalog and Watson Knowledge Catalog.

Week ending 17 April 2020

GPU environments for running notebooks are GA

GPU environments for running Jupyter notebooks with Python 3.6 are now generally available for the Watson Studio Standard and Enterprise plans. GPU environments are available in the Dallas IBM Cloud service region only.

With GPU environments, you can reduce the training time needed for compute-intensive machine learning models you create in a Jupyter notebook with Python 3.6. With more compute power, you can run more training iterations while fine-tuning your machine learning models.

See GPU environments.

Week ending 3 April 2020

Changes to Watson Studio Enterprise plan

On April 1, 2020, the Watson Studio Enterprise plan has the following changes:

  • The number of free authorized users is now 10.
  • The cost of adding extra authorized users is reduced by 50%.
  • The compute usage rate is reduced to $0.40 USD per capacity unit hour used beyond the 5000 CUH per month that are included in the plan.

Read the blog post.

Week ending 27 March 2020

AutoAI Auto-generated notebooks available in beta

Save an AutoAI pipeline as a notebook so you can view all of the transformations that went into creating the pipeline. Use the autoai-lib reference as a guide to the transformations. This feature is being offered as a beta and is subject to change. For details, see Saving an AutoAI generated notebook.

Week ending 20 March 2020

Upcoming changes to Watson Studio Lite and Enterprise plans

On May 17, 2020, the Watson Studio Lite plan will have the following changes:

  • The free compute environment will not be available. All Lite plan users will have a limit of 50 capacity unit hours of compute usage per month.
  • Large compute environments with 8 or more vCPU will not be available.
  • Only the smallest size Spark environments will be available, with 2 executors that each have 1 vCPU and 4 GB RAM, and one driver that has 1 vCPU and 4 GB RAM.
  • Compute environments provided by associated services, such as IBM Analytics Engine, will be available only with the Enterprise plan.
  • The ability to export projects will not be available.

If you need more compute resources, upgrade to the Watson Studio Standard or Enterprise plan. See Upgrading Watson Studio.

On April 1, 2020, the Watson Studio Enterprise plan will have the following changes:

  • The number of free authorized users will be doubled, to 10.
  • The cost of adding extra authorized users will be reduced by 50%.
  • The compute usage rate will be reduced from $0.50 USD to $0.40 USD per capacity unit hour used beyond the 5000 CUH per month that are included in the plan.

Read the blog post.

Upcoming changes to Watson Machine Learning GPU plans

Starting on May 1, 2020, Watson Machine Learning will update the capacity units per hour for GPU capacity types, as follows:

Capacity type         Capacity units required per hour
1 NVIDIA K80 GPU      3
1 NVIDIA V100 GPU     10

The capacity units required per hour for multiple GPUs are calculated as the capacity units per hour for a single GPU multiplied by the total number of GPUs. For example, two NVIDIA V100 GPUs require 20 capacity units per hour. For details, read the blog post.

Week ending 13 March 2020

Custom security policies available for restricting downloads

By default, Watson Machine Learning does not restrict the external sites that users can access as part of operations such as downloading data source files or installing Python library packages. If you would like to limit access to a list of approved sites, contact IBM Cloud support to request a custom network policy for your organization.

New capabilities in AutoAI

The following features are new or enhanced in AutoAI:

  • The limit on the size of a data source for an AutoAI experiment is increased from 100 MB to 1 GB.
  • The number of pipelines generated for an experiment is increased from four to eight, based on the two top-performing algorithms. You can now also increase the number of top-performing algorithms to use for generating pipelines if you want to view and compare more pipelines. Each algorithm creates four optimized pipelines. For details, see Building an AutoAI model.

Week ending 06 March 2020

Updates to Watson Machine Learning frameworks

Support is now available for TensorFlow 1.15 and Keras version 2.2.5 for training and deploying models. Due to a security vulnerability with certain TensorFlow versions, support for TensorFlow 1.13 and 1.14, along with Keras 2.1.6 and Keras 2.2.4, will be deprecated. Users need to upgrade to Keras 2.2.5 and switch to the TensorFlow 1.15 backend. For details on the changes, view this announcement. For the complete list of supported frameworks, see this topic.

Week ending 07 February 2020

New Spark and R runtime enabled for jobs in Data Refinery

You can now select Default Spark 2.4 & R 3.6 when you select an environment definition for a new job. The new runtime uses the same capacity unit hours (CUHs) as the Default Spark R 3.4 (which is Spark 2.3) runtime.

Default Spark 2.4 & R 3.6 in a job

SAV files

SPSS Statistics .sav data files are now supported for import or export in SPSS Modeler.

Exercise more control over pipeline creation for an AutoAI experiment

You now have the option of specifying which algorithms AutoAI should consider for an experiment and how many of the top performing algorithms to use for creating pipelines. For details, see Building an AutoAI model.

Week ending 10 January 2020

“Hortonworks HDFS” connection renamed to “Apache HDFS”

The Hortonworks HDFS connection is now the Apache HDFS connection, which is accessed via the WebHDFS API. Your previous settings for connections to Hortonworks HDFS remain the same. Only the connection name has changed.

Geospatio-temporal library for notebooks

You can use the geospatio-temporal library to expand your data science analysis to include location analytics in your Python notebooks that run with Spark. See Using the geospatio-temporal library.
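
A minimal, hedged sketch follows; it assumes the library is exposed as pyst and that a SparkSession named spark is available, as in a Watson Studio Spark notebook environment:

from pyst import STContext

# Register the spatio-temporal context with the notebook's Spark gateway.
stc = STContext(spark.sparkContext._gateway)

# Create two points (latitude, longitude) and measure the distance between them.
white_house = stc.point(38.8977, -77.0365)
lincoln_memorial = stc.point(38.8893, -77.0502)
print(white_house.distance(lincoln_memorial))  # distance in meters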

Week ending 13 December 2019

“Object Storage OpenStack Swift (Infrastructure)” connection is discontinued

Support for the Object Storage OpenStack Swift (Infrastructure) connection is discontinued. The Object Storage OpenStack Swift (Infrastructure) connection is no longer in the user interface.

Week ending 6 December 2019

Updates to supported frameworks for Watson Machine Learning

Support is now available for PyTorch version 1.1. For the complete list of supported frameworks, see this topic.

Due to security vulnerabilities with several TensorFlow versions, Watson Machine Learning has added support for TensorFlow version 1.14 as well as 1.13 and has removed support for all insecure TensorFlow versions, including 1.5 and 1.11. For details, read the blog announcement. For help changing to a supported runtime, see the TensorFlow compatibility guide.

Synthesized Neural Networks (NeuNetS) beta tool removed

The Synthesized Neural Networks (NeuNetS) model-building tool is removed from Watson Studio until it is merged with AutoAI. For details, see this blog post.

Week ending 22 November 2019

Synthesized Neural Networks (NeuNetS) merging with AutoAI

In 2020, the Synthesized Neural Networks (NeuNetS) model building tool (currently in beta) will be merged with AutoAI for a unified, automated model-building experience. Starting on December 6, 2019, the NeuNetS tool will be removed from the Watson Studio interface until the merge is complete. Please remove your NeuNetS models prior to that date and migrate them to newer versions of Keras models. For details on the merge of NeuNetS with AutoAI, see this blog post.

Data Refinery removes a restriction on source data

  • Column names can now include periods.

Week ending 15 November 2019

Multibyte character support

Multibyte characters are now fully supported in these areas of Watson Knowledge Catalog:

  • The names or descriptions of data policies, rules, or categories.
  • The names, business definitions, or descriptions of business glossary terms.
  • Asset tags.

However, Data Refinery does not support multibyte characters in user-input fields, and some fields allow multibyte characters but do not display them correctly.

Week ending 8 November 2019

AutoAI enhancements

New features in AutoAI give you greater control over how your model pipelines are generated and greater insight into the automated process. For example, a new visualization shows you the relationships between pipelines as well as what makes each one unique. New experiment settings let you choose specific algorithms for AutoAI to consider for model selection. You can also exercise more control over how your data is used to train the pipelines. For details, see Building an AutoAI model.

Week ending 1 November 2019

Watson Machine Learning support for TensorFlow 1.14

Due to security vulnerabilities with several TensorFlow versions, Watson Machine Learning has added support for TensorFlow version 1.14 as well as 1.13 and is deprecating support for all insecure TensorFlow versions, including 1.5 and 1.11. For details, read the blog announcement. For help changing to a supported runtime, see the TensorFlow compatibility guide.

Week ending 18 October 2019

Object detection tool

The Object Detection tool for the Visual Recognition service is now generally available. To view a video that introduces object detection, see Custom object detection models.

Week ending 04 October 2019

Data Refinery automatically detects and converts data types

Previously, when you opened a file in Data Refinery, for most file types all the columns were interpreted as the string data type. Now the Convert column type GUI operation is automatically applied as the first step in the Data Refinery flow. The operation automatically detects and converts the data types to inferred data types (for example, Integer or Boolean) as needed. This enhancement will save you a lot of time, particularly if the data has many columns. It is easy to undo the automatic conversion or to edit the operation for selected columns.

Automatic detection and conversion of data type

For more information, see Convert column type in GUI operations in Data Refinery, under the FREQUENTLY USED category.

Select the runtime for a Data Refinery flow with the Jobs interface

Previously you could select the default runtime for a Data Refinery flow in the DATA REFINERY FLOW DETAILS pane in Data Refinery (accessed from the Info pane Details tab). The SELECT RUNTIME selection has been removed. Instead, select the runtime when you save a job to run the Data Refinery flow. The runtime for any previously scheduled jobs remains unchanged. For information about jobs, see Jobs in a project.

Confirm the stop words removed in a Data Refinery flow

Use the Tokenize GUI operation to test the words you remove from a selected column with the Remove stop words GUI operation. For information, see Remove stop words in GUI operations in Data Refinery, under the NATURAL LANGUAGE category.

Week ending 20 September 2019

Decision Optimization available on all plans

Decision Optimization is now available on the Standard plan as well as the Lite and Enterprise plans. For details, see the announcement.

Removal of Apache Spark as a Service

If you were using Spark as a Service Enterprise plan or Lite plan from Watson Studio, you must switch to using built-in Spark environments. Spark as a Service is no longer supported.

For details, read this blog post: Deprecation of Apache Spark (Lite Plan).

Use built-in Spark environments instead. See Spark environments.

Week ending 6 September 2019

Changes to the Community

The Watson Studio Community is now split into two sites to better serve your needs:

  • The Gallery contains sample data sets, notebooks, and projects that you can add to Watson Studio directly. You can access the Gallery from the main menu.
  • The Community contains articles, blog posts, tutorials, events, and discussions. You can access it here.

Week ending 30 August 2019

Decision Optimization model builder beta

The Decision Optimization model builder is now in beta. With the Decision Optimization model builder, you can create several scenarios, using different data sets and optimization models. This allows you to create and compare different scenarios and see how big an impact changes can have on a given problem.

The model builder helps you:

  • Select and edit the data relevant for your optimization problem.
  • Run optimization models.
  • Investigate and compare solutions for multiple scenarios.
  • Create, import, edit, and solve Python and OPL models.
  • Import and export Python models to and from Jupyter notebooks.
  • Easily create and share reports with tables, charts and notes using widgets provided in the visualization editor.

See Decision Optimization.

Support for R 3.6 and deprecation of R 3.4

You can now use R 3.6 runtimes in Watson Studio for notebooks and AutoAI. Support for R 3.4 in Watson Studio is ending on October 30, 2019. When you upgrade a notebook from R 3.4 to R 3.6, you might need to make code changes because some open-source library versions might be different.

Read the announcement.

Reminder: Support ending for Python versions 3.5 and 2.7

Support for Python versions 3.5 and 2.7 in Watson Studio ended on August 28, 2019. Support in Watson Machine Learning is ending on September 9, 2019. If you have not already done so, migrate your assets and models to run with Python version 3.6. For more information, see the announcements for Watson Studio and Watson Machine Learning.

Week ending 16 August 2019

Open beta for Object Detection service

A new component for the Watson Studio Visual Recognition service lets you build a model that can identify objects within images. For details, see Creating custom Object Detection models.

Reminder: Support ending for Python versions 3.5 and 2.7

Support for Python versions 3.5 and 2.7 is ending on August 28, 2019. If you have not already done so, migrate your assets and models to run with Python version 3.6. For more information, see the announcements for Watson Studio and Watson Machine Learning.

Support for TensorFlow 1.13

Due to security vulnerabilities with several TensorFlow versions, Watson Machine Learning has added support for TensorFlow version 1.13 and is deprecating support for all insecure TensorFlow versions, including 1.5 and 1.11. For details, read the blog announcement.

Change to Watson Machine Learning V4 API date/time format

The date/time format returned from the Watson Machine Learning version 4 API has changed. This change will impact users who are using the V4 API-supported Watson Machine Learning Python client for creating deployments or jobs and parsing the date/time fields in deployment or jobs-related metadata.

The date format previously returned in a GET response of /v4/deployments was:

yyyy-MM-dd'T'HH:mm:ssZZZZ

The new format is:

yyyy-MM-dd'T'HH:mm:ss.SSS'Z'
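
If you parse these timestamps in Python, a minimal sketch of handling the new format (the timestamp value is a made-up example):

from datetime import datetime, timezone

# The new format carries millisecond precision and a literal "Z" suffix.
raw = "2019-08-01T12:34:56.789Z"
parsed = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
print(parsed.isoformat())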

Faster SPSS Modeler flows

SPSS Modeler flows now run faster because their environment runtime is more powerful. The environment runtime for running SPSS Modeler flows is now 4 vCPU and 16 GB RAM instead of 2 vCPU and 8 GB RAM. The new environment runtime consumes 2 capacity units per hour.

RStudio XXS environment runtime removed

The smallest RStudio environment runtime, Default RStudio XXS, with 1 vCPU and 5 GB RAM, is no longer available. Use the more powerful RStudio environment runtimes.

Week ending 09 August 2019

Deadline for migrating notebook schedules to jobs extended to August 30

You now have until Friday, August 30, 2019 to migrate your notebook schedules to the new jobs interface.

Week ending 02 August 2019

End of support for Watson Machine Learning JSON Token Authentication service

Deprecation of the Watson Machine Learning JSON Token Authentication service was announced on April 23, 2019. If you interact with the Watson Machine Learning service programmatically, via API, Python client, or command line interface, you should be using IBM Cloud VCAP credentials, as described in Watson Machine Learning authentication.

Retirement of Watson Machine Learning Model Builder

Watson Machine Learning Model Builder is no longer available for training machine learning models. Models trained with Model Builder and deployed to Watson Machine Learning will continue to be supported, but no new models can be trained using Model Builder. Instead, use AutoAI for training classification and regression models. Read about the announcement, or learn more about AutoAI.

Reminder: Data Refinery schedules will be discontinued August 12

You must migrate Data Refinery flow schedules to the new jobs interface before August 12, 2019.

Week ending 26 July 2019

General availability for Decision Optimization in notebooks

Decision Optimization is now generally available in Watson Studio notebooks when you select Python runtime environments. See Notebook environments.

AutoAI updates

These enhancements are new for AutoAI:

  • The data source you use to create an AutoAI model can now be output from Watson Studio Data Refinery.
  • After adding data to the AutoAI Experiment builder, you can preview the data without leaving the tool. You can also adjust the percentage of data that is held out to test the performance of the model, from 0 to 30 percent.
  • For binary classification models, you can edit the positive class.

For details on these updates, see Building an AutoAI model.

Details on how feature engineering transformations are applied are documented in AutoAI implementation details.

Week ending 19 July 2019

Switch to Python 3.6 environments

The default Python environment version in Watson Studio is now 3.6. Python 2.7 and 3.5 are being deprecated and will no longer be available after August 28, 2019. When you switch from Python 3.5 or 2.7 to Python 3.6, you might need to update your code if the versions of open source libraries that you use are different in Python 3.6. See Changing the environment.

Read this blog post: Python version upgrade in Watson Studio Cloud

Use a form to test an AutoAI model deployment

You can now test a deployed AutoAI model by using an input form as an alternative to entering JSON code. Enter values in the form fields, and then click Predict to see the prediction.
Prediction from test data
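
If you test with JSON instead of the form, the scoring input is a JSON payload; here is a hedged sketch of one common shape, written as a Python dict (the field names and values are invented, and the exact envelope depends on the API version that your deployment uses):

# A hypothetical scoring payload with an invented schema.
payload = {
    "input_data": [
        {"fields": ["age", "income"], "values": [[42, 50000]]}
    ]
}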

For details, see Deploying an AutoAI model.

Simplified project creation

When you create a project, you can now choose between creating an empty project and creating a project from a file or a sample. All tools are available in all projects.

See Set up a project and Importing a project.

Add dashboards to project export file

You can now include dashboards when you export a project ZIP file to your desktop. See Exporting projects.

RStudio environments

When you launch RStudio in a Watson Studio project, you can now select the RStudio environment runtime in which to launch RStudio by hardware size. For information about the RStudio environments, see RStudio jobs in a project.

Week ending 12 July 2019

New jobs user interface for running and scheduling Data Refinery flows and scheduling notebooks

The jobs user interface provides a new way to run or schedule a Data Refinery flow or to schedule a notebook. From the project page, click the Jobs tab to view all the jobs in a project and their run details. Now you can create multiple jobs for the same asset, for example a job with different runtimes or different schedules. You can also create a job from Data Refinery or a notebook. For information about jobs, see Jobs in a project.

Important
You must manually migrate your current Data Refinery flow schedules to the new jobs interface before August 12, 2019. Migrate your notebook schedules before August 30, 2019.

New default runtime for Data Refinery

The new default runtime for Data Refinery is Default Data Refinery XS. Any Data Refinery flow runs previously set to None - Use Data Refinery flow Default will now use this new runtime. Like the Spark R 3.4 runtime, the Default Data Refinery XS runtime is HIPAA ready.

Default Data Refinery XS environment

You can also select this runtime when you create a job.

Default Data Refinery XS in a job

See Data Refinery environments.

Working in Data Refinery consumes CUH

When you create or edit a Data Refinery flow, the runtime consumes capacity units per hour. The runtime automatically stops after an hour of inactivity.
Important: You can manually stop the runtime on the Environments page in your project to stop consuming CUH. See Data Refinery environments.

New way to open a Data Refinery flow from the project page

To access a Data Refinery flow from the project’s Assets page, click the Data Refinery flow’s name. (Previously, you accessed the Data Refinery flow from a Refine option under the ACTIONS menu.)

Changing the source of a Data Refinery flow is now in the Data Refinery steps

To change the source of a Data Refinery flow, click the edit icon next to Data Source in the Steps pane. (Previously, you changed the source from the Summary page.)

Edit source

As before, for best results, the new data set should have a schema that is compatible with the original data set (for example, column names, number of columns, and data types). If the new data set has a different schema, operations that won’t work with the schema will show errors. You can edit or delete the operations, or change the source to one that has a more compatible schema.

Project readme included in project export and import

Before you export a project, you can add a brief description of the analytics use case of the included assets and the applied analysis methods to the project readme. This readme is now included in the project export. When you import a project, you can check the readme for a short description of the project’s intent on the project’s Overview page.

Watson Analytics connector is discontinued

The Watson Analytics connector has been removed from the list of data sources on the New connections page.

Snowflake connector available

Projects and catalogs now support connections to a Snowflake database, enabling you to store and retrieve data there.

Week ending 05 July 2019

Creating a project from a sample

If you are new to Watson Studio and want to learn how to use data assets in tools such as notebooks to prepare data, analyze data, build and train models, and visualize analysis results, you can now create a project from a sample. See Importing a project from a sample.

Week ending 28 June 2019

How to choose a tool in Watson Studio

You can now find which tool you need to use by matching your type of data, what you want to do with your data, and how much automation you want.

See Choosing a tool.

Streams flow

The MQTT source operator now also provides the metadata attribute event_topic for each event, in addition to the message.

Week ending 21 June 2019

Decision Optimization

Decision Optimization gives you access to IBM’s world-leading solution engines for mathematical programming and constraint programming. Use this sophisticated prescriptive analytics technology, which can explore a huge range of possible solutions before suggesting the best way to respond to a present or future situation. With Decision Optimization, you can:

  • Start with a business problem, such as planning, scheduling, pricing, inventory, or resource management.
  • Create an optimization model, which is the mathematical formulation of the problem that can be interpreted and solved by an optimization engine. The optimization model plus the input data creates a scenario instance. (A sketch of a simple model in code follows this list.)
  • Run the Decision Optimization engine (or solver) to find a solution, a set of decisions that achieves the best values of the goals and respects the limits and constraints that are imposed. Metrics measure the quality of the solution in terms of the business goals.
  • Use Watson Machine Learning to deploy the solution and make it available to business users via a business application. Usually, the solution and goals are summarized in tabular or graphical views that provide understanding and insight.
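
To make the second step concrete, here is a minimal sketch of an optimization model written with docplex, the Python API for the CPLEX engine that is referenced elsewhere on this page (the production-planning numbers are invented):

from docplex.mp.model import Model

# Maximize profit for two products under invented capacity limits.
mdl = Model(name="production")
units_a = mdl.continuous_var(name="units_a")
units_b = mdl.continuous_var(name="units_b")
mdl.add_constraint(2 * units_a + units_b <= 100)  # machine hours available
mdl.add_constraint(units_a + 3 * units_b <= 90)   # raw material available
mdl.maximize(30 * units_a + 40 * units_b)

solution = mdl.solve()
print(solution)  # objective value and variable values when a solution is found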

For details on creating a prescriptive analytics model, see Decision Optimization.

For details on deploying solutions, see Decision Optimization Deployment.

AutoAI experiments

  • You can now create an AutoAI experiment from a sample file so you can see how AutoAI analyzes and transforms data, then creates model candidate pipelines for you to review and compare without having to upload your own data.
  • Follow the steps in Creating an AutoAI experiment from sample data to learn how to deploy and score a model created from the Bank marketing sample data set.
  • AutoAI models saved as Watson Machine Learning assets are now only available in the project in which they were created. AutoAI models created prior to this update are available in other projects that share the same machine learning instance.

Preview of new Python client library with Watson Machine Learning v4 API

A new version of the Python client library is in the works to support new features in Watson Machine Learning. This Python client, built on version 4 of the Watson Machine Learning APIs, is available as a preview to support Decision Optimization and AutoAI experiments.

For details on installing the new Python client and accessing the associated documentation, see Python client.
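
For orientation, a hedged sketch of installing and initializing the preview client (this assumes the preview package name watson-machine-learning-client-V4; the credential values are placeholders):

# In a notebook cell: !pip install watson-machine-learning-client-V4

from watson_machine_learning_client import WatsonMachineLearningAPIClient

wml_credentials = {
    "apikey": "<your-ibm-cloud-apikey>",      # placeholder
    "instance_id": "<your-wml-instance-id>",  # placeholder
    "url": "https://us-south.ml.cloud.ibm.com",
}

client = WatsonMachineLearningAPIClient(wml_credentials)
print(client.version)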

Streams flow

In addition to the JSON message, the Watson IoT source operator now also provides the following metadata attributes for each event: event_typeId, event_deviceId, and event_eventId.

Week ending 07 June 2019

Cognos Analytics connector available

Projects and catalogs now support connections to Cognos Analytics, enabling you to store and retrieve data there.

Enhancements to AutoAI Experiments

A new tutorial walks you through the process of building, deploying, and scoring a binary classification model using AutoAI.

The following enhancements make it easier for you to review AutoAI model pipelines:

  • Updated designs make it faster for you to see the details for a pipeline.
  • When you expand a pipeline in the leaderboard to review details, you can now view hold-out and training data scores.

Week ending 31 May 2019

Build models with AutoAI

AutoAI in Watson Studio automatically analyzes your data, selects a model type, applies estimators, and generates candidate model pipelines customized for your predictive modeling problem. Results are displayed in a leaderboard, a table that shows the automatically generated candidate models, as pipelines, ranked according to the specified criteria.

AutoAI provides a view into the estimators and hyperparameter optimization applied to each pipeline so you can have confidence in how each pipeline is generated. After viewing and comparing pipelines, you can save one as a model that can be deployed and tested. For details on building models using AutoAI, see AutoAI overview.

Week ending 24 May 2019

New SPSS Modeler nodes

You can use these new nodes in SPSS Modeler:

  • Reprojection node: A field operations node that changes fields from a geographic coordinate system to a projected coordinate system. See Reproject node.
  • Space-Time-Boxes node: A record operations node that represents a regularly shaped region of space and time as an alphanumeric string. See Space-Time-Boxes node.
  • Spatio-Temporal Prediction node: A modeling node that provides statistical techniques that you can use to forecast future values at different locations and explicitly model adjustable factors to perform what-if analysis. See Spatio-Temporal Prediction node.

New right-click option for SPSS Modeler nodes

Previously, when you right-clicked a node and selected Preview, a Data tab, Profile tab, and Visualizations tab opened—allowing you to examine your flow’s data in various ways. Now when you select Preview, you get a snapshot of your data that loads more quickly. Use the new right-click option called Profile to work with the full features such as the Visualizations tab.

Profile and Preview selections

Week ending 17 May 2019

Spark runtimes for Data Refinery flows

You can now use Spark runtimes with Data Refinery:

  • Spark R 3.4 environments for running Data Refinery flows are now generally available and are HIPAA ready. When you run a Data Refinery flow, you can select to use the preset Default Spark R 3.4 environment or configure your own Spark environment with the hardware size you need for your workload. You can’t select a Spark R 3.4 environment to schedule a Data Refinery flow run.

    Spark environment selection in the Data Refinery flow details page

    For information and instructions, see Spark environments for Data Refinery.

  • The None - Use Data Refinery Default runtime is deprecated. However, you can still use this runtime to run Data Refinery flows that operate on small data sets and to schedule Data Refinery flow runs.

Week ending 10 May 2019

Streams flow

  • Added the Binary data type to support use cases such as processing, scoring, and classifying binary data, images, video, and audio using pre-trained machine learning models.
  • Option of ingesting raw data without built-in JSON parsing in Event Streams, Kafka, and HTTP source operators.
    • Ability to have your own custom parsing by appending a Code operator (see the sketch after this list).
    • Ability to ingest binary data (for example, images, video, and audio).
  • Option of writing raw data without built-in formatting as JSON, in Event Streams and Kafka target operators.
    • Ability to have your own custom formatting by inserting a Code operator.
    • Ability to write binary data (for example, images, video, and audio).
  • Optional metadata attributes in Event Streams and Kafka source operators: event_topic, event_offset, event_timestamp, event_partition.
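For the custom-parsing case, a rough sketch of a Code operator body follows; custom formatting on the target side works along the same lines. The process(event, state) signature and the event field names are assumptions, so check the template that the Code operator generates for the real contract:

```python
import json

# Hypothetical Code operator handler for parsing raw events yourself.
# The process(event, state) signature and the "message" field are assumptions;
# the Code operator's generated template shows the actual contract.
def process(event, state):
    raw = event.get("message", "")          # raw payload from the source operator
    try:
        event["parsed"] = json.loads(raw)   # your own parsing logic goes here
    except ValueError:
        event["parsed"] = None              # keep malformed events visible downstream
    return event

# Local smoke test:
print(process({"message": '{"temperature": 21.5}'}, state={}))
```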

Week ending 3 May 2019

Watson Studio Local 2.0 GA

Watson Studio Local 2.0 is now generally available for when you need the functionality of Watson Studio on your private cloud. See Watson Studio Local 2.0 documentation.

For a comparison of features between different deployment environments of Watson Studio, Watson Machine Learning, and Watson Knowledge Catalog, see Feature differences between deployment environments.

Deprecation of Apache Spark Lite service

You can no longer associate an Apache Spark Lite service with a project. Apache Spark Lite services will be deleted on June 28, 2019. Read this blog post: Deprecation of Apache Spark (Lite Plan).

Use built-in Spark environments instead. See Spark environments.

Week ending 25 April 2019

Migration of Object Storage OpenStack Swift projects

You might not be able to download assets from any remaining projects that use Object Storage OpenStack Swift. If you have trouble downloading data from a project with Object Storage OpenStack Swift so that you can migrate the project to IBM Cloud Object Storage, open a ticket with IBM Support.

Attribute classifiers renamed to data classes

The attribute classifiers that characterize the contents of columns in relational data assets are now called data classes. Data classes are automatically assigned to columns during profiling. See Profiles.

Week ending 12 April 2019

Export a project to desktop

You can now share project assets with others by exporting your project. The project assets that you select are downloaded as a project ZIP file to your desktop.

Import a project from desktop

When you create a project, you can select the import project starter to import assets from another project into the new project.

Improved search filtering in Watson Knowledge Catalog

When you search for assets in a catalog, the search filters are now directly below the search field. You can filter on tags, business terms that are assigned to data assets, and asset types. The list of tags is now sorted alphabetically.

Searching for assets in a catalog with filters

Week ending 5 April 2019

Data Refinery: Column-action operations remain focused on the column

When you click a column in Data Refinery and select an operation from the actions menu (three vertical dots), the focus remains on that column. Previously the focus always shifted to the first column. This enhancement is especially useful when you are working on wide tables.

Week ending 22 March 2019

Import a streams flow from URL

You can now also import a streams flow from a URL, in addition to importing from a file.

Week ending 15 March 2019

Google Cloud Storage connector available

Projects and catalogs now support connections to Google Cloud Storage, enabling you to store and retrieve data there.

Week ending 1 March 2019

Migrate Watson Studio from Cloud Foundry to a resource group

You can migrate your Watson Studio service instance from a Cloud Foundry org and space to a resource group in IBM Cloud. Resource groups include finer-grained access control by using IBM Cloud Identity and Access Management (IAM), the ability to connect service instances to apps and services across different regions, and an easy way to view usage per group.

For instructions, see IBM Cloud: Migrating Cloud Foundry service instances to a resource group.

Secure Gateway service for Watson Studio instances in the Tokyo region

The Secure Gateway service is not yet available in the Tokyo (AP-North) service region on IBM Cloud. However, you can now provision a Secure Gateway service in any other region and use it when you create a connection in a Watson Studio instance from any region, including Tokyo.

Assign terms and tags to columns in Watson Knowledge Catalog

You can now assign business terms and tags to columns in relational data assets in catalogs. See Editing assets in catalogs.

Publish and subscribe messages to topics with the Streams operator in streams flow

In streams flows, you can now subscribe to topics with the Source operator and publish to topics with the Target operator in the Streaming Analytics service.

Week ending 22 February 2019

CUHs consumed by Data Refinery flow runs are now tracked

The capacity unit hours (CUHs) that are consumed when you run a Data Refinery flow in a Spark R 3.4 environment are now tracked.

For information, see Spark environments for Data Refinery.

Week ending 8 February 2019

Run Data Refinery flows in a Spark R environment (open beta)

You can now select a Spark R environment for your Data Refinery flows. You can select the Default Spark R 3.4 environment or you can create your own Spark R environment definition that is customized for the size of your data set. Each Data Refinery flow runs in a dedicated Spark cluster.

Select the Spark environment from the Data Refinery flow details page when you save and run the flow.

Spark environment selection in the Data Refinery flow details page

For information and instructions, see Spark environments for Data Refinery.

Week ending 1 February 2019

New navigation menu

You can easily find all your menu options in one place on the new navigation menu. Click the navigation menu icon to expand the menu.

The navigation menu

HIPAA readiness for Watson Studio and Watson Machine Learning

Watson Studio and Watson Machine Learning meet the required IBM controls that are commensurate with the Health Insurance Portability and Accountability Act of 1996 (HIPAA) Security and Privacy Rule requirements. HIPAA readiness applies to only certain plans and regions.

See HIPAA readiness and read this blog.

Week ending 25 January 2019

New SPSS Modeler node

SPSS Modeler flows now support the Set Globals node to compute summary values for CLEM expressions. See SPSS Modeler nodes.

Python model operator now supports Watson Machine Learning in streams flows

In addition to loading models from IBM Cloud Object Storage, the Python model operator now also supports loading models from the Watson Machine Learning service.

Week ending 18 January 2019

SPSS Modeler flows now support more of the SPSS Modeler desktop nodes

More modeling nodes are now available. See SPSS Modeler nodes.

Week ending 11 January 2019

Easy upgrade with the Upgrade button

When you’re ready to upgrade your Watson Studio, Watson Knowledge Catalog, or Watson Machine Learning service from a Lite plan, just click the Upgrade button. You’ll be guided through the upgrade in just a few clicks. See Upgrade your Watson services.

Streams flows now support User Defined Parallelism (UDP) for Event Streams

With User Defined Parallelism, multiple workers help to increase the ingestion rate for topics with multiple partitions.

Week ending 21 December 2018

Deploy functions in IBM Watson Machine Learning

You can now deploy Python functions in Watson Machine Learning the same way that you can deploy models. Your tools and apps can use the Watson Machine Learning Python client or REST API to send data to your deployed functions the same way that they send data to deployed models.
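A rough sketch of the pattern follows. The outer function returns a score closure that Watson Machine Learning calls with scoring payloads; the payload layout shown here is an assumption, so check the Watson Machine Learning documentation for the exact contract:

```python
# Sketch of a deployable Python function: an outer function returns a `score`
# closure that the service calls with each scoring request. The payload layout
# ({"fields": ..., "values": ...}) is an assumption for illustration.
def deployable_callable():
    def score(payload):
        # Pre-process rows, call a model, or post-process results here.
        values = payload.get("values", [])
        doubled = [[v * 2 for v in row] for row in values]  # toy transformation
        return {"fields": payload.get("fields", []), "values": doubled}
    return score

# Local smoke test before deploying:
score = deployable_callable()
print(score({"fields": ["x"], "values": [[1], [2]]}))
```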

See:

Week ending 14 December 2018

Watson Studio Desktop is generally available!

Watson Studio Desktop is now Generally Available to try and buy. It’s a new edition of Watson Studio for your desktop. Keep your data on your desktop and take advantage of Watson Studio features while you’re offline.
Read the blog
Start a free trial

Check service status for IBM Cloud

If you’re having a problem with one of your services, go to the IBM Cloud Status page. The Status page is the central place to find unplanned incidents, planned maintenance, announcements, and security bulletin notifications about key events that affect the IBM Cloud platform, infrastructure, and major services. See IBM Cloud services status.

A new tool for building neural networks: NeuNetS (Beta)

The NeuNetS tool in Watson Studio synthesizes a neural network and trains it on your training data without you having to design or build anything by hand.

Week ending 7 December 2018

Decision Optimization in notebooks (Beta)

Decision Optimization is now available in Watson Studio with a seamless integration of the CPLEX solvers in the Python runtime environment. When you create a notebook, choose the Default Python 3.5 XS - Beta of DO environment and the Decision Optimization package is pre-installed. Read the blog post.

Annotate and label image files with DefinedCrowd (Beta)

You can now use DefinedCrowd to annotate and label image files. See Annotate data with a crowd annotation platform and Annotate data with DefinedCrowd.

Migrate projects that use Object Storage OpenStack Swift

IBM Cloud is deprecating Object Storage OpenStack Swift. You must delete your projects that are associated with Object Storage OpenStack Swift or migrate them to Watson projects that are associated with IBM Cloud Object Storage.

You can tell if your project is associated with Object Storage OpenStack Swift on the My Projects page by looking at the STORAGE TYPE column.

Week ending 30 November 2018

General availability for streams flows

Streams flows are now generally available. Read this blog to find out more. Also see Overview of streams flows.

Choose your IBM Cloud service region

Watson Studio and Watson Knowledge Catalog services are available in multiple IBM Cloud service regions. When you sign up for Watson Studio and Watson Knowledge Catalog, your current region is selected by default. You can now select a different region. See Sign up.

With some offering plans, you can provision services in more than one region. You can now see in which service regions you have Watson Studio and Watson Knowledge Catalog services and switch to services in a different region. Click your avatar and then Change region. You’ll see an offering plan name for any instances of Watson Studio and Watson Knowledge Catalog services that you can access across the service regions. You can select a different region so that you can access the projects, catalogs, and data that you saved in that region. See Switch service region.

Week ending 23 November 2018

Removal of the Tools menu

The Tools menu is removed from Watson Studio. You access tools within a project. To access the notebook editor, the Modeler canvas, or Data Refinery, create a notebook, modeler flow, or Data Refinery flow asset. Click Add to project and then the asset type. To access RStudio, click Launch IDE > RStudio.

Annotate and label image files with Figure Eight (Beta)

You can now use Figure Eight to annotate and label image files. See Annotate data with a crowd annotation platform and Annotate data with Figure Eight.

Week ending 16 November 2018

Project Assets page improvement

The Assets page in your projects now shows only the Data assets category by default. Other asset type tables appear after you add an asset of that type. Click Add to project to add new types of assets.

Week ending 9 November 2018

Provision Watson services in the Tokyo (AP-North) region

You can now provision Watson Studio, Watson Knowledge Catalog, and Watson Machine Learning in the Tokyo (AP-North) service region in IBM Cloud.

The Tokyo (AP-North) region has the following limitations for these services:

  • If you need a Spark runtime, you must use the Spark environment in Watson Studio for the model builder, modeler flow, and notebook editor tools. The Apache Spark service is not available in the AP-North region.
  • The Real-time Streaming Predictions deployment type is not yet available.
  • Deep learning is not yet available. You can’t create deep learning notebooks or deep learning experiments.
  • The Neural Network Modeler is not yet available.
  • The Total Catalog Assets information is not yet available.
  • Profiling of data assets that contain unstructured textual data is not yet available.
  • Multibyte characters are not supported in user-input fields in Data Refinery. Some fields allow multibyte characters but do not display them correctly.
  • Activity Tracker events for services provisioned in the Tokyo region are shown in the Activity Tracker service in the Sydney region.

Adding asset types is easier

You no longer need to enable a tool on the project Settings page to add analytic assets and access their associated tools. All analytic assets are listed on the Add to project menu. By default, the project Assets page shows only the Data assets section. As you add analytic assets, the appropriate sections appear.

SPSS Model operator (Processing and Analytics) for streams flows

In the Properties pane of the SPSS Model operator, all models of all Watson Machine Learning instances are listed. Previously, only models of a selected Watson Machine Learning instance were shown.

You can also see the list of all SPSS models in the Assets tab in the canvas palette. Like other operators and connections, you can drag the model from the palette to the canvas and integrate it into the streams flow. See SPSS model.

Image of model assets tab

Week ending 2 November 2018

Manage authorized users for Watson Studio

You can now change the number of authorized users for your Watson Studio account if you have the Standard or Enterprise plan. Authorized users are project collaborators with the Admin or Editor role. You are billed extra for authorized users when they exceed the number set by your offering plan. Choose Manage > Billing and Usage > Authorized Users. See Manage authorized users.

“Streams Designer” renamed to “streams flow”

“Streams Designer” is renamed to “streams flow” to be more consistent with other types of flows, for example, modeler flows or Data Refinery flows. To create a streams flow, from within a project, click Add to project > Streams flow. See Overview of a streams flow.

Week ending 26 October 2018

Redesigned home page

The home page you see when you log in to Watson Studio or Watson Knowledge Catalog is redesigned to help you get started quicker. Take a tour! Choose Support > Launch Tour.

RStudio is no longer available in older projects with Object Storage OpenStack Swift

When you launch RStudio within Watson Studio, you must now choose a project. You can’t choose a project that uses Object Storage OpenStack Swift. You can choose only projects that use IBM Cloud Object Storage. Check the Storage section on the project Settings page to see which type of object storage the project uses.

SPSS Model operator in streams flows (Processing and Analytic)

Use the SPSS Model operator in a streams flow to run a predictive model that was created in IBM Watson Machine Learning. A predictive model refers to the prepared scoring branch of an SPSS modeler flow in Watson Machine Learning.

See SPSS Model operator.

Week ending 19 October 2018

“Data flow” renamed to “Data Refinery flow”

Data flows are now called Data Refinery flows to better distinguish them from other kinds of flows in the Watson Studio user interface, for example, modeler flows or machine learning flows.

To create a Data Refinery flow, from the Projects page, go to Add to project > Data Refinery flow.

New population pyramid chart in Data Refinery Visualizations

Population pyramid charts show the frequency distribution of a variable across categories. They are typically used to show changes in demographic data.

Population pyramid chart

To access the charts in Data Refinery, click the Visualizations tab, and then select the columns to visualize.

RStudio in Watson Studio projects in the US South and United Kingdom regions

RStudio is now integrated in IBM Watson Studio projects in the US South and United Kingdom regions and can be launched after you create a project. When you open RStudio, a default RStudio Spark environment runtime is automatically activated. With RStudio integration in projects, you can access and use the data files in the IBM Cloud Object Storage bucket associated with your project in RStudio. See RStudio overview.

Add access groups for collaborators in catalogs

You can add an IBM Cloud access group to a catalog. All the members of the access group become catalog collaborators with the role that you assign to the access group. See Add access groups.

Week ending 12 October 2018

Redact masking method

Redact is a new method of data masking for policies. You can now redact data values in asset columns, which means that data is replaced with Xs to remove information that is, for example, identifying or otherwise sensitive. With redacted data, neither the format of the data nor referential integrity is retained. See Masking data.
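Conceptually, redaction behaves like this small sketch (illustrative only; the real masking is applied by policy enforcement in the catalog):

```python
def redact(value: str, width: int = 10) -> str:
    """Illustrative only: replace a value with Xs. A fixed width means neither
    the original format nor referential integrity survives redaction."""
    return "X" * width

print(redact("555-12-3456"))  # XXXXXXXXXX
```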

Streams Designer (beta)

IBM Streams Designer is back in IBM Watson Studio! We’ve developed quite a few new operators and have added functionality to improve your streaming experience.

MQTT operator (Source, Target)

Use the MQTT operator to stream messages. MQTT is a publish and subscribe messaging transport protocol that is designed to push messages to clients.

You must have your own MQTT broker.

For details, see MQTT.

Debug operator (Target)

Use the Debug operator to view the tuples coming from a selected operator. No data is stored.

Cloudant operator (Target)

Use the Cloudant operator to store data as documents in an IBM Cloudant database. Data is stored in JSON format.

Cloud Function operator (Target, Processing and Analytic)

Use the Cloud Function operator to process streaming data and make your serverless functions react to incoming events. You can define multiple parallel workers to increase the processing rate.

Python Machine Learning operator (Processing and Analytic)

Use the Python Machine Learning operator to run Python models that do real-time predictions and scoring.

For details, see Python Machine Learning.

Db2 Warehouse on Cloud operator (Target)

Use this operator to store data in Db2 Warehouse on Cloud.

Email operator (Alerts)

Use this operator to send email to selected recipients. You can embed tuple fields, based on the schema, in the Subject and Body of the email.

Downloading log files

You can add log messages to the Code operators and to the Python Machine Learning operator. Those messages are sent to a log file that you can download from the Notifications pane of the Metrics page while the streams job is running.

For details, see Downloading the user log.

Installing Python packages

In addition to the supported and preinstalled packages, your streams flow might need other packages for specific work. For these cases, you can install Python packages that are managed by the pip package management system.

By default, pip installs the latest version of a package, but you can install other versions.

For details, see Installing other Python packages.

Support for container-based Streaming Analytics instances

Streams Designer now supports only container-based plans of the Streaming Analytics service. VM service plans are not supported.

Performance improvements

The Cloud Object Storage operator and the Code operators have been optimized for improved performance.

Deprecated

The Geofence example streams flow and the geofence operator are no longer supported.

Week ending 5 October 2018

Annotate data with DefinedCrowd (beta)

You can now use DefinedCrowd to improve the quality of your training data. By involving human judgement offered through DefinedCrowd to interpret text sentiment, you can improve your model input data and raise the model confidence. See Annotate data with a crowd annotation platform and Annotate data with DefinedCrowd.

Week ending 28 September 2018

Watson Knowledge Catalog Standard plan

The new Watson Knowledge Catalog Standard plan fits between the no-cost Lite plan and the Professional plan, which is an enterprise version with additional capabilities and entitlements. Use the Standard plan while you set up your first catalog and policies.

The Standard plan includes business glossary terms, policies, data lineage, and integration with IBM InfoSphere Information Governance Catalog. The integration with Information Governance Catalog allows clients to seamlessly synchronize metadata between Watson Knowledge Catalog and Information Governance Catalog. The Standard plan includes 500 capacity unit hours and the ability to purchase more to process data preparation flows and profiling activities. Read the blog about Watson Knowledge Catalog plan changes.

See Offering plans for details.

Changes to the Watson Knowledge Catalog Lite plan

The Watson Knowledge Catalog Lite plan is updated with these changes:

  • You can now create one rule that you can use in policies, and five business terms.
  • You can no longer view the lineage of assets in catalogs or projects.
  • You cannot make new connections to Dropbox, Tableau, Db2 for z/OS, and Looker data sources. Your existing connections to these data sources remain.
  • The number of assets and collaborators in your catalog is now limited to 50. Your existing assets and catalog users remain unchanged. However, if you have more than 50 assets or collaborators, you won’t be able to add more until you reduce the numbers to below 50, or you upgrade your plan.

See Offering plans.

Data Refinery: Another place to schedule data flows

You can now add a schedule for your data flow from the data flow’s Summary page. In the Runs section, click the Schedule tab for the options.

New Schedule button

New operation in Data Refinery for summary calculations

Use the new Aggregate GUI operation to apply summary calculations to the values of a column. You can group the results by the values in a different column. Previously, aggregate functions were only available as code operations. The Aggregate operation is under the ORGANIZE category. For information, see GUI operations in Data Refinery.
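For comparison, the same kind of summary calculation in pandas (sample data invented for the example):

```python
import pandas as pd

# Invented sample data: the Aggregate operation computes summary values for a
# column, optionally grouped by the values in another column.
df = pd.DataFrame({
    "region": ["east", "east", "west", "west"],
    "sales":  [100, 150, 80, 120],
})

summary = df.groupby("region")["sales"].agg(["sum", "mean", "count"])
print(summary)
```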

Annotate data with Figure Eight (Beta)

You can now use Figure Eight to improve the quality of your training data. By involving human judgement offered through Figure Eight to interpret mood, intention, or tone for example, you can improve your model input data and raise the model confidence. See Annotate data with a crowd annotation platform and Annotate data with Figure Eight.

Provision Watson services in the Germany region

You can now provision Watson Studio, Watson Knowledge Catalog, and Watson Machine Learning in the Germany service region in IBM Cloud.

The Germany region has the following limitations for Watson services:

  • If you need a Spark runtime, you must use the Spark environment in Watson Studio for the model builder, modeler flow, and notebook editor tools. The Apache Spark service is not available in the Germany region.
  • The Batch Prediction and Real-time Streaming Predictions deployment types are not yet available.
  • Deep learning is not yet available. You can’t create deep learning notebooks or deep learning experiments.
  • The Neural Network Modeler is not yet available.
  • The Usage Statistics page for catalogs is not yet available.
  • The Total Catalog Assets information is not yet available.
  • Profiling of data assets that contain unstructured textual data is not yet available.

RStudio in Watson Studio projects in the Germany region

RStudio is now integrated in IBM Watson Studio projects in the Germany region and can be launched after you create a project. When you open RStudio, a default RStudio Spark environment runtime is automatically activated. With RStudio integration in projects, you can access and use the data files in the IBM Cloud Object Storage bucket associated with your project in RStudio. See RStudio overview.

Week ending 21 September 2018

New options on the Manage menu to replace the Admin Console

The administrative tasks that you previously performed on the Admin Console are now on separate pages that you access from the Manage menu.

You must be the owner or an administrator of the IBM Cloud account for your Watson services to perform these tasks:

  • Choose Manage > Storage Delegation to configure IBM Cloud Object Storage to enable non-administrative users to create projects and catalogs and to use your own encryption key.
  • Choose Manage > Environment Runtimes to view and manage the active environment runtimes.
  • Choose Manage > Catalogs to view statistics for all catalogs and assign Watson Knowledge Catalog service administrators.

See Manage your Watson services.

IBM Message Hub is renamed to Event Streams

As of September 17, 2018, IBM Message Hub is now Event Streams. The name change will not affect how you experience IBM Message Hub; however, it will improve naming consistency across the IBM Cloud catalog of services.

New and improved visualization charts in Data Refinery

Data Refinery introduces new visualization charts with a new user interface that gives you more views of your data to help you explore it as you refine it. No syntax required.

View of four charts

Enhancements include:

  • Many more charts: We have 21 charts out of the box and more coming soon. New charts include the 3D chart, which displays data in a 3-D coordinate system by drawing each column as a cuboid. You can even rotate the view. The t-SNE chart is useful for embedding high-dimensional data into a space of two or three dimensions, which can then be visualized in a scatter plot (see the sketch at the end of this entry). For the list of charts, see Visualize your data.

  • Charts are interactive: Within the same chart, use sliders and settings to view the data in different ways. For example, view a pie chart in a rose or ring format. In the heat map chart, adjust the category order by “As read,” “Ascending,” or “Descending.” Hover over the information to zoom in on the values.

  • Charts are customizable: Select a global color scheme for all the charts.

  • Charts are downloadable: Download a chart as an image with annotated details such as a title and secondary title.

Access the charts in the same way as before. Click the Visualizations tab in Data Refinery, and then select the columns to visualize. The suggested charts for the data are indicated with dots next to the chart type. The chart automatically updates as you refine the data.
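If you want to reproduce the t-SNE embedding outside the chart UI, a minimal scikit-learn sketch looks like this (assumes scikit-learn is installed; the chart itself needs no code):

```python
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE

# Embed 4-dimensional iris measurements into 2 dimensions, the same idea
# the t-SNE chart applies to the columns you select.
X = load_iris().data
embedding = TSNE(n_components=2, random_state=0).fit_transform(X)
print(embedding.shape)  # (150, 2)
```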

New operation in Data Refinery for extracting date or time values

Use the Extract date or time value operation to extract a portion of a date or a timestamp value so that you can use that value for further data refining. The Extract date or time value operation is under the CLEANSE category.

Week ending 14 September 2018

Watson services don’t require an organization

You no longer need to specify an organization when you sign up for Watson Studio or Watson Knowledge Catalog, and the organization information is removed from the Profile menu and your Profile page. See Sign up.

Support for custom components in AI models

You can now define your own transformers, estimators, functions, operations, classes, and tensors in models you train on IBM Watson Machine Learning for online deployment. See: Requirements for using custom-defined components in your models
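For example, a custom scikit-learn transformer is one kind of component you might define. This is a toy sketch; see the linked requirements for what Watson Machine Learning actually accepts:

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class ClipNegatives(BaseEstimator, TransformerMixin):
    """Toy custom transformer: replaces negative values with zero."""

    def fit(self, X, y=None):
        return self  # nothing to learn

    def transform(self, X):
        return np.clip(X, 0, None)

# Usable inside a scikit-learn Pipeline that you train and deploy.
print(ClipNegatives().fit_transform(np.array([[-1.0, 2.0], [3.0, -4.0]])))
```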

Upload individual images for Visual Recognition

You can now upload individual training images one at a time for use with the Visual Recognition model building tool. See: Preparing images for training a custom model

Add service IDs to projects

You can now add service IDs that you created in your IBM Cloud account as collaborators to your projects. See Project collaborators.

Week ending 07 September 2018

General availability for Spark environments for notebooks

Spark environments for notebooks are now GA. With Spark environments, you can define the hardware and software configurations to start custom Spark clusters on demand. Spark environments can quickly scale resources up or down. This makes them well suited for a variety of use cases, from trying out new machine learning algorithms on sample data to running large production workloads on the distributed computation engine. See Spark environments.

Publish a dashboard to a catalog

You can now publish a dashboard to a catalog so it can be added into other projects. When you publish the dashboard, you can include a preview of the dashboard so that users can see what the dashboard looks like before adding it into another project.

Watson Studio supports globally provisioned services

You can now associate services with Watson Studio using a combination of resource groups, organizations, and regions. You are no longer restricted to using service instances provisioned in the same region as your Watson Studio instance.

Week ending 31 August 2018

Refresh the preview of connected data

You can now see when the data in the preview was last fetched from the connection and refresh the preview data in projects and catalogs by clicking Refresh.

Week ending 24 August 2018

Db2 Big SQL connector available

Projects and catalogs now support connections to Db2 Big SQL, enabling you to store and retrieve catalog data there.

Spark environments for the model builder and Spark modeler flows (Beta)

Apache Spark environments are available not only for notebooks, which need to run with Spark, but also for the model builder and Spark modeler flows. Instead of using a Spark service in Watson Studio, you can now create Spark environments in which you define the hardware and software configuration of the runtimes that you want your tools to use. See Spark environments.

Week ending 17 August 2018

General availability for Natural Language Classifier in Watson Studio

The Natural Language Classifier tooling in Watson Studio is now GA. While maintaining existing API functionality, you can train and test your classifiers within Watson Studio.

Learn more:

Partitioned data assets

You can now create a connected data asset from a set of partitioned data files that are in a single folder in IBM Cloud Object Storage. Partitioned data assets have previews and profiles and can be masked like relational tables. However, you cannot yet shape and cleanse partitioned data assets with the Data Refinery tool.

Week ending 10 August 2018

Looker connector

Projects and catalogs now support connections from the Looker platform. See Connection types for configuration instructions.

Tableau connector

Projects and catalogs now support connections to Tableau, enabling you to store and retrieve catalog data there.

Week ending 3 August 2018

Update Natural Language Classifier classifiers trained outside of Watson Studio

You can now update older Natural Language Classifier classifiers by associating the classifiers with a project in Watson Studio and importing the training data. See Managing classifiers with Watson Studio.

Lineage for data assets

You can now see the history of the events performed on data assets from files and connected data assets in Watson Studio and Watson Knowledge Catalog in a lineage graph. Only assets that are created on or after 20 July, 2018, have lineage graphs. See Lineage.

Spark environments (Beta)

Apache Spark environments are available in beta. Instead of using a Spark service in Watson Studio, you can use the Spark engine that is available by default for all Watson Studio users. See Spark environments.

Week ending 27 July 2018

Microsoft Azure Data Lake Store connector

Projects and catalogs now support connections to Microsoft Azure Data Lake Store, enabling you to store and retrieve catalog data there. See Connection types for configuration instructions.

Download the log for a data flow

You can now download the log for each data flow run. Go to the project > Assets tab > Data flows section and click the data flow run. Select View log from the run’s menu, and then click Download. The log file name is the name of the data flow appended with the date and time (24-hour system) when the data flow was run. Invalid characters for file names are changed to an underscore (_).
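In Python terms, the naming scheme works roughly like the following sketch (which characters count as invalid is an assumption here):

```python
import re
from datetime import datetime

def log_file_name(data_flow_name: str) -> str:
    # Assumption: the set of characters treated as invalid is illustrative only.
    safe = re.sub(r'[<>:"/\\|?*]', "_", data_flow_name)
    stamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")  # 24-hour clock
    return f"{safe}_{stamp}.log"

print(log_file_name("sales/clean: v2"))  # e.g. 'sales_clean_ v2_2018-07-27_14-03-59.log'
```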

Week ending 20 July 2018

Lineage for Watson Machine Learning models

You can now see the history of the events performed on Watson Machine Learning model assets in Watson Studio, Watson Knowledge Catalog, and Watson Machine Learning in a lineage graph. Only assets that are created on or after 20 July, 2018, have lineage graphs. See Lineage.

Automatically synchronize assets with Information Governance Catalog

You can configure automatic synchronization of data assets between an Information Governance Catalog and a Watson Knowledge Catalog catalog. After the initial synchronization, subsequent changes to data assets in either catalog are propagated to the other catalog. You must configure an Information Governance Catalog - Watson Knowledge Catalog Connector to communicate between the two catalogs.

New “Conditional replace” operation in Data Refinery

Use the Conditional replace operation to replace the values in a column based on conditions.

Additional support for date and timestamp data types in Data Refinery

  • GUI operations: The Replace missing values operation and the Calculate operation can now be used on date and timestamp columns.
  • Coding operations: Many of the commands are enabled for date and time data types. You can use the command-line help and the customized templates to ensure that your command syntax is supported. For an overview of the commands, see Interactive code templates in Data Refinery.

Week ending 13 July 2018

Unicode characters in asset names and descriptions

You can now use any Unicode character, except control characters, in asset names and descriptions in projects and catalogs.

Now in the UK South service region: deep learning

Deep learning features make it possible to train complex neural networks on large data sets, using GPUs and distributed training.

Tools now available in Watson Studio web interface:

  • Neural network palette in the flow editor
  • Experiment builder

Actions now available with the command line interface and API:

  • Running one or more training runs
  • Running experiments

Explore deep learning features through these tutorials: MNIST tutorials

Catalog admins can change the owners of public assets

A catalog administrator no longer needs to be a member of a public asset to change the owner of the asset to another asset member. See controlling access to an asset.

View the name of who initiated the data flow in Data Refinery

In the Runs section of the data flow details page, the History tab displays detailed information about each data flow. Previously, it displayed the email address of the person who initiated each flow. Now it displays the name of the person who initiated each flow.

Filter multiple columns in Data Refinery

You can now specify conditions for multiple columns in a Filter operation. Previously, you could only select one column. Also within the Filter operation, you can now use the operators “Is equal to” and “Is not equal to” on the values in a column.

Change the source of a data flow

You can now change the source of a saved data flow. This enhancement means that you can run the same data flow but with a different source data asset. The new data set must have a schema that is compatible with the original data set (for example, column names, number of columns, and data types). Go to the project > Assets tab > Data flows section and click the data flow. In the data flow summary page, click the “Change the source data asset” icon (Change source) to select a different data source.
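In pandas terms, “compatible schema” amounts to roughly the following check (a sketch of the idea, not the service’s actual validation):

```python
import pandas as pd

def compatible_schema(original: pd.DataFrame, replacement: pd.DataFrame) -> bool:
    """Sketch of the compatibility idea: same column names, column count,
    and data types. The service's real check may differ in detail."""
    return (list(original.columns) == list(replacement.columns)
            and list(original.dtypes) == list(replacement.dtypes))

old = pd.DataFrame({"id": [1], "name": ["a"]})
new = pd.DataFrame({"id": [2], "name": ["b"]})
print(compatible_schema(old, new))  # True
```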

Week ending 29 June 2018

Natural language classification (Beta)

You can now try the beta version of a new, graphical tool in Watson Studio for building natural language classifiers.

Building a classifier is fast and easy: Upload training text examples in a .csv file, and then click Train.

See Overview: Natural Language Classifier in Watson Studio.

Account users need IAM access

IBM Cloud services are transitioning from using Cloud Foundry organizations for access control to using Identity and Access Management (IAM).

When you add users to your IBM Cloud account, you must now assign IAM Editor access to non-administrative users as well as adding them to a Cloud Foundry organization. See Set up additional account users.

If you previously added users to your IBM Cloud account without assigning IAM access, your users might not be able to provision instances of some services, for example, Visual Recognition. To allow users to provision service instances that use IAM, assign them IAM Editor access.

Week ending 22 June 2018

Data Refinery enhancements to the Filter operation for date and timestamp columns

Use the Filter operation on date and timestamp columns with the following new operators:

  • Is empty
  • Is equal to
  • Is greater than
  • Is greater than or equal to
  • Is less than
  • Is less than or equal to
  • Is not empty
  • Is not equal to

Google BigQuery connector

Projects and catalogs now support connections to Google BigQuery, enabling you to store and retrieve data there.

Dashboards support connected data

You can now use a connected data asset in your project as a data source to create a dashboard.

Reuse dashboards for similar data assets

You can now link an existing dashboard with a different data asset, as long as the column names and data types are the same as the original data asset.

Week ending 15 June 2018

Publish trained models to a catalog

You can now publish a trained model from a project into a catalog so that other users in your organization can view the model details and deploy the model. See Publish an asset from a project to a catalog.

To deploy a model from a catalog, first add it to a project. See Adding catalog assets to a project.

Download files from folder assets

You can now download files that you access from within a folder asset. The IBM Cloud Object Storage connection asset that is associated with the folder asset must include an Access Key and a Secret Key for the download to succeed. See Add a folder asset to a project and Add a folder asset to a catalog.

Export and import dashboards

You can now export a dashboard by downloading it as a JSON file to your local system. You can then import that dashboard to a different project by uploading the JSON file on the New dashboard screen. When you open an imported dashboard, you are notified if any of the necessary data assets are missing. After you add the data assets to the project, reopen the dashboard and relink the data assets. See Analytics dashboard.

Data Refinery enhancements

Enhancements for handling unstructured text:

  • Use the new Tokenize operation to break up English text into words, sentences, paragraphs, lines, characters, or by regular expression boundaries.
  • Use the new Remove stop words operation to remove English stop words.
  • The Filter operation has new operators that support text and regular expression patterns: Contains, Does not contain, Starts with, Does not start with, Ends with, Does not end with.
  • Use the Sample operations to specify a random sampling step, or automatic or manual stratified sampling based on a step in the flow. Sampling steps from UI operations apply only when the flow is run.

Enhancements to coding operations:

  • Use the new random sampling coding operations (sample_n and sample_frac) to show the result of the sampling in the interactive Refinery tool and apply sampling when the flow is run.
  • Use the mutate coding operation, mutate(provide_new_column = n()), to add a template for counting rows.

Week ending 8 June 2018

Create folder assets in projects and catalogs

You can now create a folder asset based on a path within an IBM Cloud Object Storage system that is accessed through a connection. You can view the files and subfolders that share the path with the folder asset. The files that you can view within the folder asset are not themselves data assets. For example, you can create a folder asset for a path that contains news feeds that are continuously updated. You can preview the contents of files in the folder asset and use the Data Refinery tool to manipulate the data in the files.

Files in folder assets are subject only to policies that operate on the folder asset. Policies cannot operate directly on files in folder assets.

See Add a folder asset to a project and Add a folder asset to a catalog.

Encrypt your IBM Cloud Object Storage instance with your own key

You can now encrypt the Cloud Object Storage instance that you use for projects and catalogs with your own key. You must also have an instance of the IBM Key Protect service. See Configure Cloud Object Storage for project and catalog creation.

The website addresses that you add in asset descriptions in projects and catalogs are now active hyperlinks.

Unicode characters in asset names

You can now use any Unicode character in asset names in projects and catalogs, except control characters.

Enforced policy chart on the data dashboard in Watson Knowledge Catalog

The Policy enforcements over time section on the data dashboard now shows a line chart for enforced policies. You can select the time span and granularity to display the information on a daily or monthly basis.

Discover data assets from Oracle and Apache Hive

You can discover assets from connections to Oracle and Apache Hive data sources.

Week ending 1 June 2018

Watson Studio

The following new feature is specific to Watson Studio.

Core ML support

After you train your IBM Watson Machine Learning model, you can now download a Core ML (.mlmodel) file to build into your iOS apps. See:

Week ending 25 May 2018

Watson

The following new features are included in all Watson services.

Data refining

New operators for Filter operation
The Filter operation in the FREQUENTLY USED category now supports these additional operators:

Operator               Numeric   String   Boolean
Contains                           ✓
Does not contain                   ✓
Ends with                          ✓
Does not end with                  ✓
Starts with                        ✓
Does not start with                ✓

Watson Studio

The following new feature is specific to Watson Studio.

Multiple instances of Watson Studio in an IBM Cloud account

You can now provision multiple instances of the Watson Studio service in an IBM Cloud account.

Week ending 18 May 2018

Watson

The following new features are included in all Watson services.

Data refining

Data flow output pane changes
The Data flow output pane is now in display mode by default. If you want to change any of the output details, click Edit Output to put the pane into edit mode. After you save your changes, the pane is returned to display mode.

Enhanced target options for connected data assets
Data Refinery now supports the same target options for connected data assets that it supports for the underlying connections. For example, it supports the same target options for both connected relational data assets and for connections to tables in relational databases. This includes the options for modifying an existing data set (Overwrite, Recreate, Insert, Update, Upsert). As another example, Data Refinery supports the same target options for both file-based connected data assets and for connections to files.

Week ending 11 May 2018

Watson

The following new features are included in all Watson services.

Data refining

Names of data flow run initiators now displayed
When you view the log for a data flow run, you’ll now see the name (instead of a unique ID) of the user who initiated the run at the top of the log.

Template-level command line assistance
The command line has new template-level help. After selecting an operation, simply click the operation name and select a syntax template. Use the template and the content assist to quickly and easily create a customized operation that you can apply to your data.

Project readme file

You can now document your project in a readme file using standard Markdown formatting. The readme is at the bottom of the project Overview page for all new or existing projects that use IBM Cloud Object Storage. Legacy projects won’t have readme files. See Overview page.

Week ending 4 May 2018

Watson

The following new features are included in all Watson services.

Data refining

Data flow run initiators added to logs
When you view the log for a data flow run, you’ll now see the unique ID of the user who initiated the run at the top of the log.

Week ending 27 April 2018

Watson

The following new features are included in all Watson services.

Data refining

New Substring operation
The Substring operation in the Text category can create substrings from column values. You simply indicate the starting position within the text and the length of each substring. As with many operations, you can overwrite the current column values or you can create a new column to hold the substrings.
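For comparison, the same idea in pandas (note that pandas positions are 0-based; the Data Refinery operation may count positions differently):

```python
import pandas as pd

s = pd.Series(["refinery", "substring", "watson"])
start, length = 0, 3                        # starting position and length
print(s.str.slice(start, start + length))   # ref, sub, wat
```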

Browsing lots of data flow runs
You can now easily browse a large number of data flow runs on the data flow details page. If there are more runs on the History tab than are currently visible, just click the new Show More button to see more runs.

Watson Studio

The following new feature is specific to Watson Studio.

Project-lib for R save_data function

You can now save data to the object storage associated with your project by using the project-lib library for R. See Project-lib for R.

Watson Knowledge Catalog

The following new feature is specific to Watson Knowledge Catalog.

Easily specify IBM Cloud account emails

When specifying the owner of a business term or defining conditions that require user IDs in the rule builder, start typing the name or email address of a user in your IBM Cloud account. You can then choose the account email from a selection list.

Week ending 20 April 2018

Watson

The following new features are included in all Watson services.

Watson Analytics connector: support for Sydney data center

The Watson Analytics connector now provides support for the Sydney data center. When you create a new target connection to Watson Analytics, you can select AP1-Sydney as the data center in the Connection details section.

Microsoft Azure SQL Database connector: support for secure gateway

The Microsoft Azure SQL Database connector now provides support for the secure gateway. When you create a new connection to Microsoft Azure SQL Database, you can select the Use a secure gateway option in the Connection details section.

Watson Knowledge Catalog

The following new features are specific to Watson Knowledge Catalog.

Business terms restricted to one owner

You can now select only one owner for a business term in the business glossary. Free-text names, whether one or several, are no longer supported and will be removed. The owner must be a registered IBM Cloud user. If you entered a name or a non-registered email address, those entries will be removed.

View and download asset files from a catalog

You can now view and download files that are associated with assets from a catalog. For example, if you uploaded a data file or a PDF file as a data asset, catalog collaborators can download the file from the asset Overview page.

Week ending 13 April 2018

Watson

The following new features are included in all Watson services.

Changes to top-level menus

To be more consistent with IBM Cloud, some menus and menu items in the header are moved or new:

  • You can now switch your account or organization from your avatar menu.
  • You now access administrative pages from the new Manage menu. The Manage menu also has options to manage your IBM Cloud account.
  • You now access the FAQ and the What’s New blog entries, and give feedback, from the new Support menu.
  • You now access the Watson Studio and Watson Knowledge Catalog documentation by clicking the Docs button instead of an icon.

Watson Knowledge Catalog

The following new feature is specific to Watson Knowledge Catalog.

Mask sensitive data in columns

You can now protect sensitive data while allowing access to the rest of the data asset. You can create policies that mask sensitive values for a column in a data asset when users view the asset in a catalog or work with the asset in a project. See Masking data.

Week ending 6 April 2018

Watson

The following new features are included in all Watson services.

Data refining

Browsing lots of data assets and connections
You can now browse a large number of data assets and connections. When selecting a data asset or selecting data from a connection, you’ll see new Show More buttons at the bottom of the Data Asset tab and the Connections tab. To see more assets or connections than are currently visible, just click this button to get another page of items.

Timestamp support
The Convert column type operation in the FREQUENTLY USED category now supports String to Timestamp conversions and Timestamp to String conversions.

Enhanced date support
When converting a column from String to Date, you no longer need to ensure that the column data is in MM/DD/YYYY or MM-DD-YYYY format. You’ll now be prompted to select the current order of the month, day, and year in the date values.
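The prompt corresponds to choosing a parse format. In pandas terms, for illustration:

```python
import pandas as pd

values = pd.Series(["03-04-2018", "05-12-2018"])

# The same string parses differently depending on the declared order:
print(pd.to_datetime(values, format="%m-%d-%Y"))  # month first: Mar 4, May 12
print(pd.to_datetime(values, format="%d-%m-%Y"))  # day first:  Apr 3, Dec 5
```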

Watson Studio

The following new features are specific to Watson Studio.

IBM Watson Visual Recognition

Enhanced face model GA

The enhanced face detection model is now GA! This enhanced model includes reduced bias, increased accuracy of facial detection for age and gender, and tighter age ranges. Existing users of the GA /v3/detect_faces endpoint will not have to do anything, as the enhancements will be automatic. Users of the beta endpoint will need to change their requests to point to the GA endpoint by May 17, 2018, as the beta endpoint will be deprecated and no longer accessible. Read more.

Week ending 30 March 2018

Watson

Enhanced target connection support

The following connectors, which supported only source connections in the past, now support target connections too:

  • IBM services
    • Compose for PostgreSQL
    • Informix
  • Third-party services
    • Amazon Redshift
    • Pivotal Greenplum

Watson Studio

Notebook creation with Apache Spark

When you create a notebook using Apache Spark, you can only run the notebook in Spark 2.1. Spark versions 1.6 and 2.0 are no longer available for selection during notebook creation.

Week ending 23 March 2018

Watson Knowledge Catalog

Preview Microsoft Excel documents

You can now see the contents of Microsoft Excel documents that you add to a catalog on the asset’s Overview page.

Data Refinery

Enhanced date support

The Convert column type operation in the FREQUENTLY USED category now supports String to Date and Date to String conversions. When converting a column from String to Date, ensure that the column data is in either MM/DD/YYYY or MM-DD-YYYY format.

Week of 20 March 2018

Watson

New names for the Data Science Experience and Data Catalog services!

The new names better align with new AI features:

  • Data Science Experience is now named Watson Studio.
  • Data Catalog is now named Watson Knowledge Catalog.

Machine learning and AI

Image classification with Visual Recognition

You can now use IBM Watson Visual Recognition within Watson Studio to classify images. Visual Recognition uses deep learning algorithms to analyze images for scenes, objects, faces, and other content. You use the Visual Recognition model builder tool to quickly and easily train and test custom models. See Visual Recognition overview.

Deep learning

You can now use deep learning techniques to train thousands of models to identify the right combination of data plus hyperparameters that optimizes the performance of your neural networks. You can run more experiments faster. You can train deeper networks and explore broader hyperparameter spaces. Watson Machine Learning accelerates this iterative cycle by simplifying the process to train models in parallel with an on-demand GPU compute cluster. See Deep learning.

You can use the Experiment Builder tool to define training runs for your experiment and automatically optimize hyperparameters. See Experiment Builder.

You can use the neural network designer tool to create deep learning flows. Design deep models for image data (CNN architectures) and for text and audio data (RNN architectures). The neural network designer supports 31 types of layers. Any architecture that you can design from a combination of these 31 layers can be built with the flow modeler and then published as a training definition file. See Neural network designer.

Modeler flows

You can now create a machine learning flow, which is a graphical representation of a data model, or a deep learning flow, which is a graphical representation of a neural network design, by using the Flow Editor. Use it to prepare or shape data, train or deploy a model, or transform data and export it back to a database table or file in IBM Cloud Object Storage.

Watson Studio

Create a project with tools specific to your needs

When you create a project, you can now choose the project tile that fits your needs. The tile selection affects the types of assets you can add to the project, the tools you can use, and the IBM Cloud services you need.

You can choose from these tiles when you create a project from the Watson Studio home page:

  • Basic: Add collaborators and data assets.
  • Complete: All tools are available. You can add services as you need them.
  • Data Preparation: Cleanse and shape data.
  • Jupyter notebooks: Analyze data with Jupyter notebooks or RStudio.
  • Experiment Builder: Develop neural networks and test them in deep learning experiments.
  • Modeler: Build, train, test, and deploy machine learning models.
  • Streams Designer: Ingest streaming data.
  • Visual Recognition: Classify images.

If you create a project from the My Projects page, your project has all tools.

After you create the project, you can add or remove tools on the Settings page.

Create dashboards to visualize data without coding

With a Cognos dashboard, you can build sophisticated visualizations of your analytics results, communicate the insights that you’ve discovered in your data on the dashboard, and then share the dashboard with others. See Cognos dashboards.

Customization support for Python environments

You can customize the software configuration of the Python environments that you create. See Environments.

Watson Knowledge Catalog

Refine catalog data assets

You can now refine data assets that contain relational data after you add them to a project. Projects that you create with Watson Knowledge Catalog include the Data Refinery tool so that you can cleanse and shape data.

Profile documents with unstructured data

Data assets that contain unstructured data, such as Microsoft Word, PDF, HTML, and plain text documents, are automatically profiled by IBM Watson Natural Language Understanding to show the distribution of inferred subject categories, concepts, sentiment, and emotions for the document on the asset’s Profile page. You can also see the profile when you add the asset to a project. See Profile data assets.

Preview PDF documents

You can now see the contents of PDF documents that you add to a catalog on the asset’s Overview page.

Review and rate assets

You can now review and rate an asset, or read reviews by other users in a catalog. View the asset and go to its Reviews page to read reviews or to add a review and a rating.

You can now see recommended and highly rated assets on the Browse page of a catalog:

  • Click Watson Recommends to see the top 20 assets that are recommended for you based on attributes common to the assets that you’ve accessed.
  • Click Highly Rated to see the assets that have the highest ratings.

See Find and view assets in a catalog.

Data Refinery

Target file format

If you select a file in a connection as the target for your data flow output, you can now select one of the following formats for that file (illustrated in the sketch after this list):

  • AVRO - Apache Avro
  • CSV - Comma-separated values
  • JSON - JavaScript Object Notation
  • PARQ - Apache Parquet
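
For orientation, here is roughly what writing each format looks like from a notebook with pandas; file names are illustrative, and Data Refinery handles this for you when the data flow runs.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

df.to_csv("output.csv", index=False)         # CSV
df.to_json("output.json", orient="records")  # JSON
df.to_parquet("output.parq")                 # Parquet (needs pyarrow or fastparquet)
# Avro has no built-in pandas writer; libraries such as fastavro need an
# explicit schema, which Data Refinery derives from the data flow output.
```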

Week ending 9 March 2018

Watson

New style

You’ll notice color and font style changes across the Watson services and tools. These changes align with the style of IBM Cloud to provide a more consistent user experience.

New Watson Analytics connector

Projects and catalogs now support connections to IBM Watson Analytics, enabling you to store data there. (The Watson Analytics connector supports target connections only.)

Watson Studio

PixieDust 1.1.8 adds PixieDebugger

PixieDust release 1.1.8 introduces a visual Python debugger for Jupyter Notebooks: PixieDebugger. It is built as a PixieApp and includes a source editor, local variable inspector, console output, the ability to evaluate Python expressions in the current context, breakpoint management, and a toolbar for controlling code execution. In addition to debugging traditional notebook cells, PixieDebugger also works to debug PixieApps, which is especially useful when troubleshooting issues with routes. See Release notes for 1.1.8.
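
To try it, run import pixiedust once in your notebook, then start the cell you want to debug with the %%pixie_debugger magic. A minimal sketch (the cell body is illustrative):

```python
%%pixie_debugger
# PixieDebugger opens in place of the normal cell output: set breakpoints,
# step through the loop, and watch `total` in the variable inspector.
total = 0
for n in range(5):
    total += n
print(total)
```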

Data Refinery

Basic date and time support

Some Data Refinery operations now support datetime values.

  • Convert column type (support for converting from datetime values only)
  • Remove
  • Rename
  • Sort ascending
  • Sort descending

Watch for more operations to provide this support in the future!

New Substitute operation

The Substitute operation in the FREQUENTLY USED category can obscure sensitive information from view by substituting a random string of characters for the actual data in the column.
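
A rough pandas equivalent of what Substitute does; the column name is illustrative, and Data Refinery’s exact substitution scheme may differ.

```python
import random
import string

import pandas as pd

def substitute(value):
    """Replace a value with a random string of the same length."""
    return "".join(random.choices(string.ascii_letters, k=len(str(value))))

df = pd.DataFrame({"ssn": ["123-45-6789", "987-65-4321"]})
df["ssn"] = df["ssn"].map(substitute)  # original values are no longer visible
print(df)
```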

Catalogs

Import assets from IBM InfoSphere Information Governance Catalog

You can import assets into a catalog from an Information Governance Catalog archive file. You must have the Watson Knowledge Catalog Professional plan and have the Admin role in the catalog to import Information Governance Catalog assets. See Import assets from Information Governance Catalog into a catalog.

Week ending 2 March 2018

Watson Studio

Apache Spark Service Python 3.5 notebooks now on Anaconda 5.0

The Apache Spark Service upgraded the Anaconda distribution used for Watson Studio notebook environments to Anaconda 5.0. This upgrade changes the versions of libraries that were previously installed in the Watson Studio notebook environment. Some of the upgraded libraries have changed their APIs, which might cause your existing code to throw warnings or errors.
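
A quick way to see which versions the upgraded environment gives you, so you can spot API changes that might affect existing code (the libraries shown are examples):

```python
import matplotlib
import numpy
import pandas

for lib in (numpy, pandas, matplotlib):
    print(lib.__name__, lib.__version__)
```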

RStudio and R version upgraded

RStudio in Watson Studio is upgraded to version 1.1.419 and R in RStudio is now version 3.4.3. See the list of many new features that you’ll be able to use with RStudio in Watson Studio: RStudio release history. You might have to update some packages to work with the new R version.

Streams Designer

Event Streams (Source) operator configuration

Until now, if a streams flow stopped while Event Streams producers continued to send messages to the topic, those messages were retained in the Event Streams queue, but when the streams flow restarted, it could not go back and consume them.

Now, in the Properties pane of the Event Streams (Source) operator, you can select the Resume reading check box to start reading the Event Streams queue from where the streams flow left off.

You can also configure Default Offset, which determines where to begin reading in the Event Streams queue when the streams flow runs for the first time, when Resume reading is not selected, or when the resumption offset is lost. You can choose to start reading from the latest message or from the earliest message.

User-installed Python libraries

In addition to the supported and pre-installed packages, your streams flow might need other packages for specific work. For these cases, you can install Python packages that are managed by the pip package management system. The packages are found at Python Package Index. By default, pip installs the latest version of a package, but you can install other versions.

In Streams Designer, edit the streams flow that will use the package. Click Settings, and then click Environment.

For details, see Installing other Python libraries.
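
Version pinning uses standard pip requirement specifiers; the package name and versions below are illustrative.

```python
# Standard pip requirement specifiers:
#   numpy              -> latest available version (the pip default)
#   numpy==1.14.0      -> exactly version 1.14.0
#   numpy>=1.13,<1.15  -> any 1.13.x or 1.14.x release
# The same syntax works when installing programmatically from a notebook:
import subprocess
import sys

subprocess.check_call([sys.executable, "-m", "pip", "install", "numpy==1.14.0"])
```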

Week ending 23 February 2018

Data Refinery

Snapshot view

You can see what your data looked like at any point in time by simply clicking a step in the data flow. This puts Data Refinery into snapshot view. For example, if you click the data source step, you’ll see what your data looked like before you started refining it. You can also click any operation step to see what your data looked like after that operation was applied.

New operation descriptions

Data Refinery provides a description for each operation in the Steps tab. (This replaces the R code that was previously displayed.)

Insert, edit, and delete operations in a data flow

Previously, you could delete the last operation step in a data flow. Beginning this week, you can also insert, edit, and delete any operation step in a data flow. See Data flows and steps for more information.

Cancel a data flow run

You can cancel a data flow run when it’s in progress, that is, when its status is Running. To cancel a run, select Cancel from the run’s menu on the History tab of the data flow details page.

Insert and update rows in relational database targets

If you select an existing relational database table or view as the target for your data flow output, you have several options for how the existing data set is affected:

  • Overwrite - Drops the existing data set and recreates it with the rows in the data flow output
  • Truncate - Deletes the rows in the existing data set and replaces them with the rows in the data flow output
  • Insert Only (Append) - Appends all rows of the data flow output to the existing data set
  • Update Only - Updates rows in the existing data set with the data flow output; doesn’t insert any new rows
  • Upsert (Merge) - Updates rows in the existing data set and appends the rest of the data flow output to it

For the Update Only and Upsert (Merge) options, you’ll need to select the columns in the output data set to compare to columns in the existing data set. The output and target data sets must have the same number of columns, and the columns must have the same names and data types in both data sets.
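
The database applies these options server side. As a rough illustration of the Update Only and Upsert (Merge) semantics on a toy key column, here is a pandas sketch; the table contents are made up.

```python
import pandas as pd

existing = pd.DataFrame({"id": [1, 2], "amount": [10, 20]}).set_index("id")
output = pd.DataFrame({"id": [2, 3], "amount": [25, 30]}).set_index("id")

# Update Only: change matching rows, don't insert new ones.
update_only = existing.copy()
update_only.update(output)  # id 2 becomes 25; id 3 is ignored

# Upsert (Merge): update matching rows and append the rest.
upsert = output.combine_first(existing)  # id 2 updated, id 3 appended

print(update_only, upsert, sep="\n\n")
```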

Week ending 16 February 2018

Watson Studio

Environments for notebooks (Beta)

In this beta release of environments, you can select default Anaconda environments with different hardware and software configurations for running Jupyter notebooks. You can have more than one environment in a project and then associate these environments with your notebooks depending on the hardware and software requirements of each notebook. See Environments.

Policies

View statistics about data assets with personal or restricted information

The Data Dashboard has been extended. You can now check how many data assets contain personal or restricted data. By default, the following classifications are identified: sensitive personal information (SPI), personally identifiable information (PII), or confidential. You can also use your own business terms instead of these classifications. See Policy usage statistics.

Choose email addresses from a list in the Rule Builder

When you create a rule in the Rule Builder and need to specify email addresses, start typing and then choose from a list of matching email addresses.

Week ending 9 February 2018

Data Refinery

Create, edit, and delete data flow schedules

When you save or run a new data flow, you can add a one-time or repeating schedule for that data flow. You can subsequently edit or delete the schedule from Data Refinery as well.

Scheduled data flow runs are displayed on the Schedule tab of the data flow details page. Past data flow runs are displayed on the History tab of the same page.

Preview source and target data sets from the data flow details page

You can view summary information for a data flow by going to the project > Assets tab > Data flows section and clicking the data flow that you’re interested in. In the Summary section, you can now preview both the source and target data sets.

Watson Studio

Object Storage OpenStack Swift deprecation

When you create a project, use IBM Cloud Object Storage instead of Object Storage OpenStack Swift.

Object Storage OpenStack Swift is no longer available when you create a project if you access Watson Studio from the US-South Dallas region with the dataplatform.ibm.com URL. The Object Storage OpenStack Swift service is available until the end of March 2018 in the United Kingdom region with the eu-gb.dataplatform.ibm.com URL. Projects with Object Storage OpenStack Swift continue to work.

Easily add Community data sets to a project and notebook

You can add a Community data set to a project by clicking the Add to project button on the data set and selecting a project. Then you can use the Insert to code function for the data set within a notebook. See Load and access data in a notebook.

New Python connector for IBM Cloud Object Storage

You can now use Python connector code in a notebook to load data from and save data to an IBM Cloud Object Storage instance. See Python connectors.
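
The connector follows the S3-compatible client pattern of the ibm-cos-sdk package. In this sketch the endpoint, bucket, keys, and credentials are all placeholders; copy the real values from your Cloud Object Storage service credentials.

```python
import ibm_boto3
import pandas as pd
from ibm_botocore.client import Config

# All values below are placeholders from your COS service credentials.
cos = ibm_boto3.client(
    "s3",
    ibm_api_key_id="YOUR_API_KEY",
    ibm_service_instance_id="YOUR_SERVICE_INSTANCE_ID",
    ibm_auth_endpoint="https://iam.ng.bluemix.net/oidc/token",
    config=Config(signature_version="oauth"),
    endpoint_url="https://s3-api.us-geo.objectstorage.softlayer.net",
)

# Load a CSV object into pandas, then save results back to the bucket.
body = cos.get_object(Bucket="my-bucket", Key="data.csv")["Body"]
df = pd.read_csv(body)
cos.put_object(Bucket="my-bucket", Key="results.csv", Body=df.to_csv(index=False))
```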

PixieDust 1.1.7 is available

PixieDust release 1.1.7 adds support for aggregate value filtering, updates table visualization, improves Brunel rendering, and has some updated icons. See Release notes for 1.1.7.

Week ending 2 February 2018

Watson

New Services menu

The Data Services menu is now the Services menu, with new options to add and manage IBM Cloud AI and compute services, as well as data services.

Streams Designer

New canvas design

Check out the new appearance of the Streams Designer canvas! It now has the same look and feel as the Watson Studio common canvas.

Take note of these changes:

  • The bottom toolbar actions (Settings, Run, Save, Metrics) were moved to the top toolbar.

  • The Close button is gone. Instead, click the Metrics icon in the top toolbar to go to the Metrics page, or click the breadcrumbs to return to the Project page.

  • Autosave is coming! In the meantime, click the Save icon to save your work.

New operators

  • Code (in Sources list of operators)

Previously, the Code operator was only a Processing and Analytics type of operator. Now, the Code operator is also available as a Source operator. This operator gives you a convenient way to generate your own sample data or to consume data from an external source.

For details, see Code operator.

  • Python Machine Learning (in Processing and Analytics list of operators)

This operator provides a simple way to run Python models from popular frameworks for real-time prediction and scoring.

The Python ML operator is based on the Code operator. In addition, it can load model file objects from Cloud Object Storage and generate the necessary callbacks in the code.

For details, see Python Machine Learning operator.

Data Refinery

Save data flow output as a data asset

You can save data flow output as a new data asset or you can replace an existing data asset. By default, data flow output is saved as a new data asset in the project.

To specify that your data flow output be saved as an existing data asset:

  1. From the Data flow output pane, click Change Location.
  2. Select the data asset you want to replace. Note that the target name changes to the name of the existing asset.
  3. Click Save Location.

Change your column selection in the Operation pane

After you choose an operation, you can change the column that you want to apply the operation to. Just click Change Column Selection at the top of the Operation pane, select a new column, and click Save.

New progress indicator

A progress indicator is now displayed when you choose to refine a data set. The indicator provides useful information about what’s going on behind the scenes of Data Refinery.

Week ending 26 January 2018

Watson

New Teradata connector

Projects and catalogs now support connections to Teradata, enabling you to access data stored there.

Watson Studio

Any collaborator can leave a project

You can leave a project, regardless of your role in it. Previously, only collaborators with the Admin role could leave a project. See Leave a project.

Data Refinery

Data sample size

The name of the source file and the number of rows in the data sample are now displayed at the bottom of Data Refinery. (A data sample is the subset of data that’s read from the data source and visible in Data Refinery. It enables you to work quickly and efficiently while building your data flow.)

Preview data sources

When you’re selecting data to add to Data Refinery, you can now preview a data source before selecting it. Simply click the eye icon next to the file, table, or view that you want to preview.

Week ending 19 January 2018

Watson

See your current account

If you can access Watson services in other IBM Cloud accounts because you’ve been added as a user in those accounts, you can now quickly see which account you’re logged in to by clicking your profile avatar. The account shows under your user name. You can switch accounts with your avatar menu.

New Dropbox connector

Projects and catalogs now support connections to Dropbox, enabling you to access files stored there. To obtain the application token that’s needed to configure a Dropbox connection, follow the instructions in the Dropbox OAuth guide.

Policies

Edit and delete capabilities

You can now edit and delete more policy items:

  • Delete business terms: In Business Glossary, you can now delete business terms that are in draft or archived state.
  • Delete policies: In Policy Manager, you can delete draft or archived policies.
  • Edit rules: You can update rules in published policies if you have the Admin role for the Watson Knowledge Catalog service. The updated rule applies to all other published policies that contain that rule.

Governance Dashboard is renamed to Data Dashboard

The Governance Dashboard is now called Data Dashboard. If you have the necessary permissions, you can see the Data Dashboard by choosing Governance > Data Dashboard.

Watson Knowledge Catalog

Discover assets from PostgreSQL

You can now discover assets from connections to PostgreSQL data sources.

Connections to IBM Cloud Object Storage are on the Settings page

You can now see the connections to your IBM Cloud Object Storage instance on the Settings page of the catalog. The connections no longer appear in the list of catalog assets.

Mark connection assets as private

You can now mark a connection asset as private so that only the connection asset members can see and use the connection.

See policy information when assets are blocked

When an asset is blocked by policies, you now see a message that identifies the policy.

Streams Designer

Use the new Streams Designer tutorials to gain hands-on experience in designing, running, and troubleshooting your streams flows. You can watch videos or follow along with the tutorials to see how easy it is to design and deploy a streams flow. See Tutorials of streams flows.

Data Refinery

Larger subsets

To enable you to work quickly and efficiently when creating a data flow, Data Refinery operates on a subset of rows in each data set. Beginning this week, the size of that subset is larger (750 KB). This enables you to see more of your data and use more data for interactive cleansing and shaping operations.

Watson Studio

PixieDust 1.1.6 is available

PixieDust release 1.1.6 updates the Bokeh version and fixes a Bokeh display problem. PixieApps now automatically collapse dropdowns. See Release notes for 1.1.6.

Week ending 12 January 2018

Watson Studio

Library to interact with project assets within notebooks

You can use the pre-installed project-lib library in Python notebooks to interact with projects and project assets. Using the project-lib library, you can access project metadata and assets, including files and connections. The library also contains functions that simplify fetching files from the object storage associated with the project. See Use project-lib to interact with projects and project assets.
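
A short sketch of the pattern; generate the project ID and access token from the project’s Settings page, and treat the exact constructor arguments and method signatures here as illustrative.

```python
import pandas as pd
from project_lib import Project

# Placeholders: the ID and token come from the project's Settings page.
project = Project(None, "<project_id>", "<project_access_token>")

# List the files stored in the project's object storage.
print(project.get_files())

# Fetch one file as a file-like object and read it with pandas.
df = pd.read_csv(project.get_file("my_data.csv"))  # hypothetical file name

# Save derived data back to the project as a new file.
project.save_data("results.csv", df.to_csv(index=False), overwrite=True)
```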

Dive deeper with enhanced flow editor topics

Lists of the Modeler nodes and SparkML nodes now provide you with more detail about each of the node controls and functions. See Creating machine learning flows with SparkML nodes and Creating machine learning flows with SPSS nodes.

Watson Machine Learning topics re-engineered for the way you work

It’s a flow thing. We’ve reworked the order of topics to reflect the way data scientists use our product. By making use of extensive feedback and leveraging content on IBM Cloud, we hope to make it easier for you to learn as you go. See Watson Machine Learning.

Watson Knowledge Catalog

Recent catalogs in the Catalogs menu

You can quickly open catalogs that you’ve accessed recently from the Catalogs menu. Previously, you chose View All Catalogs to go to the Your Catalogs page and then opened the catalog you wanted.

Week ending 5 January 2018

Watson

Important information appears in an announcement bar

If there’s an important product update or great new feature that we think you need to know about, it appears in an announcement bar at the top of the screen. You can easily dismiss announcements, and if you want to read them again, click the notification bell to see the notifications log.

Watson Studio

Invite project collaborators with an email list

It’s easier to invite multiple collaborators to a project. You can paste a list of email addresses that are separated by commas into the Invite field, instead of pressing Enter between each email address.

PixieDust 1.1.5 is available

PixieDust now properly supports Python’s string format operator: %. When you define PixieApp views, you can choose to use Markdown syntax instead of HTML. See Release notes for 1.1.5.

Watson Knowledge Catalog

Improved navigation for policies

In the Business Glossary, you can use breadcrumbs to quickly jump to previous screens.

In the Policy Manager, you can sort by policy status to find your policies within a category. By default, all published policies are now displayed first.

Week ending 22 December 2017

Watson Studio

PixieDust 1.1.4 is available

PixieApps can now use any third-party plotting library (like Matplotlib) with a route method, and developers can now more easily create PixieApp HTML fragments with Jinja. See Release notes for 1.1.4.

Week ending 15 December 2017

Watson

Assign project and catalog creation rights in the Admin Console

Your account members don’t have to be administrators of a Cloud Object Storage instance to create projects or catalogs. You can now specify which instances of Cloud Object Storage can be used by non-administrators on the Project & Catalog Creation page of the Admin Console.

Watson Knowledge Catalog and policies

Include more information when importing business terms

When importing business terms to the Business Glossary, you can now add the business definition and the state of the business term to the CSV file you want to import.

Sort policies

You can now sort policies contained in a category by columns, such as by name, status, date, and so on. By default, policies are sorted by status.

Data Refinery

GUI operation enhancements

  • The Search bar in the Operation list helps you quickly find the operation you’re looking for.
  • The Join operation in the ORGANIZE category can join two data sets in a variety of ways. You can perform a full join, inner join, left join, right join, semi join, or anti join. You can also select the columns you want to see in the result set, and if there are same-named columns between the two data sets, you can specify unique suffixes to differentiate them. (See the pandas sketch after this list for how these join types behave.)
  • You no longer need to select a column before clicking the Operation menu. You’ll be prompted to make a selection after you choose an operation and only those columns that are appropriate for that operation are selectable.
  • A snapshot of the selected columns is shown to the right of the Operation pane so you can see the data while you fill in operation details.
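
For readers who think in code, here is how those join types behave, sketched with pandas on toy data; Data Refinery performs the joins for you without any coding.

```python
import pandas as pd

left = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
right = pd.DataFrame({"id": [2, 3, 4], "score": [20, 30, 40]})

inner = left.merge(right, on="id", how="inner")  # ids 2 and 3
full = left.merge(right, on="id", how="outer")   # ids 1 through 4
left_join = left.merge(right, on="id", how="left")

# Semi join: keep left rows that have a match, without right's columns.
semi = left[left["id"].isin(right["id"])]
# Anti join: keep left rows that have no match.
anti = left[~left["id"].isin(right["id"])]

# Same-named columns get unique suffixes, as in the Data Refinery option.
both = left.merge(left, on="id", suffixes=("_l", "_r"))
```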

Code operation enhancements

  • The command line has new operation- and function-level help to assist you in quickly and easily creating customized operations that you can apply to your data.
  • Background highlighting of command line elements provides a visual indicator that syntax, column, and function suggestions are available. Just click the elements to invoke the suggestions.
  • Coming soon… template-level help!

Data format specification

When file-based data is read into Data Refinery and it doesn’t look the way it should, click the Specify data format icon. To ensure that Data Refinery can correctly read your data, adjust the data format assumptions, such as whether the first line contains column headers, what the field delimiter is, and what the quote and escape characters are.
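
These are the same assumptions you would pass to a CSV reader in code. A pandas sketch for comparison (file name and values are illustrative):

```python
import pandas as pd

df = pd.read_csv(
    "export.txt",      # illustrative file name
    header=0,          # first line contains column headers (use None if not)
    sep=";",           # field delimiter
    quotechar='"',     # quote character
    escapechar="\\",   # escape character
    encoding="utf-8",
)
```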

Week ending 8 December 2017

Watson

Simplified registration

When you activate Watson services, you can now activate all the Watson services that you want in a single screen. See Activate Watson services.

Improved getting started experience

When you sign in to Watson services, the Get Started information on the landing page shows more key tasks so you can be productive faster.

Watson Studio

More control over your services

You no longer provision a Spark service and an object storage service when you sign up for Watson Studio. Instead, when you create a project you provision the object storage type that you want and you can choose whether to include a Spark service in your project. See Set up a project.

View scheduled job details

You can now view details about the scheduled jobs for running notebooks without editing the schedule. While editing the notebook, click the Schedule icon and then choose View job details. See Schedule a notebook.

Watson Knowledge Catalog

General availability for Watson Knowledge Catalog

The Watson Knowledge Catalog service is now generally available (GA). Read this blog to learn how to switch your beta catalogs to a GA plan. Read this FAQ to understand what happens to your beta functionality when you switch to a GA plan.

Discover data assets from connections

You can discover assets from a connection, so that all user tables and views accessible from the connection are added as data assets to the project that you select. From the project, you can evaluate each data asset and publish the ones you want to the catalog.

You can discover assets from connections to the following data sources:

  • IBM Cloud Object Storage (IaaS)
  • IBM Cloud Object Storage
  • Db2 on Cloud
  • Db2 Warehouse on Cloud
  • Db2
  • Microsoft SQL Server
  • MySQL on Compose
  • Postgres on Compose

See Discovering data assets from a connection.

Policies

Data class groups for rules

When a data asset is added to a catalog with policies enforced, it is automatically profiled and classified as part of the policy framework. The profiling process samples the data asset and leverages different algorithms to determine the type of content in the data asset.

Automatic profiling is based on over 160 data classes provided by IBM, categorized into 12 data class groups. You can now select one of these data class groups when defining rules instead of having to select individual data classes from a long list. For example, if you want to restrict access to a data set that contains personal information, you can select the data class group Personal Information, which comprises basic attributes of an individual, such as person name, date of birth, and gender. See data class groups.

Edit policies and rules

You can now edit published policies to refresh policy details and to add or delete the rules they contain. If you just want to change the name or description of a policy, hover over the information you want to update. To add or delete rules contained in a published policy, click Edit to select which rules you want to delete, add, or create. See Finding and viewing a policy. To edit category details, hover over the name or description of a category.

Week ending 1 December 2017

Watson Studio

Improved security: restrict project membership

When you create a project, you can now choose to restrict who can be a collaborator. If you select the Restrict who can be a collaborator checkbox, you can add only members of your IBM Cloud account to the project, or, if your company has SAML federation set up in IBM Cloud, only employees of your company.

To add catalog assets to a project, the project must be restricted. See Set up a project.

Existing projects can no longer get assets from catalogs

You can no longer add assets from a catalog to existing projects that are not restricted. However, any catalog assets that you previously added to an unrestricted project remain in the project.

View data classes for data assets

If you have Watson Knowledge Catalog, you can create a profile of a data asset to view the data classes that are inferred for each column in any data asset in a project. Click the data asset name to see the preview and then click the Profile tab. Click Create Profile to start the profiling process.

PixieDust support for Brunel

PixieDust 1.1.3 supports Brunel as an additional chart-rendering option for feature-rich interactive data visualizations. The Brunel renderer for PixieDust supports all chart types: bar, line, scatter, pie, histogram, and maps. Maps also support extra visualization options: heatmap, treemap, and chords. See Release notes for 1.1.3.

Watson Knowledge Catalog

Improved security: restricted membership

Catalog membership is now restricted to members of your IBM Cloud account, or, if your company has SAML federation set up in IBM Cloud, employees of your company. To add catalog assets to a project, the project must be similarly restricted. However, members of unrestricted projects can publish assets to a catalog if they are members of both and have sufficient permissions. See Manage access to a catalog.

View data classes for data assets

The profile of a data asset shows the data classes that are inferred for each column in the data set. You can see the profile when you view the asset and click the Profile tab. In catalogs with policies enforced, data asset profiles are created automatically, based on the first 5000 rows of the data set. In catalogs that do not have policies enforced, data assets are not profiled automatically. You must create a profile. See Profile data assets.

Delete connections

You can now delete connection assets from a catalog.

Streams Designer

  • Disqualified rows are shown in preview – In the Edit Schema window, you can preview the incoming events based on the defined schema of the source operator. Disqualified values are highlighted in red. Preview helps you in two ways:

    • You’ll get an indication of which events will be discarded when they don’t comply with the defined schema.
    • You’ll save time because you don’t need to run the streams flow in order to discover a mismatch with the schema.
  • Download logs and link to Streaming Analytics instance - Streams Designer provides a notification panel on the Metrics page that shows any compilation or runtime errors. To help you debug an error, you can now download the logs from the Streaming Analytics instance. A link is also provided to the Streaming Analytics instance that is used to run the streams flow.
  • Automatic restart of Streaming Analytics instance - If the Streaming Analytics instance is stopped while you’re running a streams flow, you can now automatically restart the instance in Streams Designer without having to go to IBM Cloud to do so.

    If the instance cannot be started (for example, the Lite plan expired and the instance is disabled), you will receive a message with a link to the instance on IBM Cloud.

  • Indication of an ‘unhealthy’ running streams flow - When a streams flow is running, Streams Designer now indicates whether the flow is ‘unhealthy’. This tells you that there are issues with the running flow that should be investigated. Look for errors in the Notification panel or download the logs.

Week ending 24 November 2017

Watson Studio

Mention users in comments in notebooks

While editing a notebook, you can mention another user, who is a project collaborator, in a comment. Only that user is notified of the comment. To mention a user in a comment, enter the @ symbol and start entering the user’s name until you can choose it from the search results: for example, @joe_blue. Then Joe Blue receives a notification that you mentioned him in a comment in a notebook.

Week ending 17 November 2017

Streams Designer in open beta

Use Streams Designer to collect, curate, analyze, and act on massive amounts of changing data in real time. Regardless of whether the data is structured or unstructured, you can leverage data at scale to drive real-time analytics for up-to-the-minute business decisions. See Get started with Streams Designer.

If you participated in the streaming pipelines closed beta, here are the new features for open beta.

Streaming pipelines now has a new name – Streams Designer! The streaming pipeline capability is now called Streams Designer, and the streaming pipeline asset is now called a streams flow.

Streams Designer in the Tools menu

You can now create a new streams flow directly from the Tools menu.

You associate the new streams flow with a project in the New Streams Flow window.

Support for new IBM Cloud Object Storage (IAM support)

You can now connect and stream data to your IBM Cloud Object Storage instances by using the IBM Cloud Object Storage operator.

Full integration with connections

Streams Designer is now fully integrated with connections. You select your data source by using a connection. Within your streams flow, you can reuse the connections that you defined in the project. You can drag existing connections onto the canvas to create operators that are pre-connected to service instances. You can also create a new connection that can be used in other streams flows in the project.

New Db2 Warehouse operator

Db2 Warehouse on Cloud is a fully-managed, enterprise-class, cloud data warehouse service. Use the new Db2 Warehouse operator to connect to your Db2 Warehouse on Cloud instances.

Streams Flows ‘View All’

In the Assets tab of a project’s Project page, the ‘View all’ link is shown when there are more than 10 streams flows.

Click the “View all” link to see all streams flows in table or tile view. You also get useful information about your streams flows, such as status, running time, and more.

Data Refinery

Guided tour

A quick guided tour now introduces first-time users to concepts and features in Data Refinery.

New operations

  • The Trim quotes operation in the Text category can remove single or double quotation marks that enclose text
  • The Convert column value to missing operation in the Cleanse category can convert values in the selected column to missing values in one of two ways:
    • Column values in the selected column match values in a second, specified column
    • Column values in the selected column match a specified value

Data flow enhancements

  • You can now save data flow output to a new connection asset by selecting a folder or schema, saving the location, and then providing a new, unique name for the target data set
  • The data flow run information now includes new status icons, the number of rows read from the data source and written to the target, and the name of the user who initiated each run

Week ending 10 November 2017

Data Refinery

New and enhanced operations

  • A new Concatenate string operation in the Text category can link together any string with a column value. You can add the string to the left, right or both sides of the value.
  • The Split column operation can now split a column by using a regular expression pattern (see the sketch after this list). This new method joins the text, position, and default methods already in place.
  • The Replace substring operation can now perform replacement based on a regular expression pattern, in addition to the already-supported text method.
  • The Filter operation now supports multiple conditions on a single filter. You can combine the different conditions with AND or OR operators.
  • The Replace missing values operation is now supported for string columns as well as numeric columns.
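
A pandas sketch of the regular-expression and multi-condition behaviors described above; column names and patterns are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"code": ["AB-12 x", "CD-34 y"], "qty": [5, 50]})

# Split column by a regular expression pattern (hyphen or space here).
parts = df["code"].str.split(r"[- ]", expand=True)

# Replace substring by a regular expression: strip the digits.
df["code"] = df["code"].str.replace(r"\d+", "", regex=True)

# Filter with multiple conditions combined with AND (&) or OR (|).
filtered = df[(df["qty"] > 10) | df["code"].str.startswith("AB")]
print(parts, filtered, sep="\n\n")
```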

Watson Machine Learning

New and enhanced operations

  • The Flow Editor has new navigation features. For an overview of the changes, you can take the tour, which is available from within the Flow Editor work area when you create a new flow.
  • Machine Learning notebooks have been updated to include the most recent API changes, such as the requirement to use a Bearer token for authorization calls (sketched after this list).
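
At the time of writing, the authorization pattern looks roughly like this; the host, paths, and credentials are placeholders taken from your Watson Machine Learning service credentials.

```python
import requests

# Placeholders: url, username, and password come from your Watson Machine
# Learning service credentials; the token endpoint path is illustrative.
wml_url = "https://ibm-watson-ml.mybluemix.net"
response = requests.get(wml_url + "/v3/identity/token",
                        auth=("USERNAME", "PASSWORD"))
token = response.json()["token"]

# Subsequent API calls pass the token as a Bearer authorization header.
headers = {"Authorization": "Bearer " + token}
models = requests.get(
    wml_url + "/v3/wml_instances/INSTANCE_ID/published_models",
    headers=headers,
)
print(models.status_code)
```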

Week ending 3 November 2017

New Watson services in open beta

The new Watson services Watson Knowledge Catalog and Data Refinery are now in open beta. If you have a Watson Studio account, you can try them out for free by clicking your avatar and then Add other services. Or you can sign up for any of the services.

Get started!

New type of project

The new type of project, called a Watson project, has these new features:

  • The project uses IBM Cloud Object Storage.
  • You can use the new Watson services with the project.
  • You create and edit connections within the project.

To create a Watson project, choose IBM Cloud Object Storage when you create the project. You can continue to create legacy-style projects that use Object Storage OpenStack Swift.