2017 What’s New
Here are the new features for Watson Studio, Watson Machine Learning, and Watson Knowledge Catalog for the year 2017.
Week ending December 22, 2017
PixieDust 1.1.4 is available
PixieApps can now use any third-party plotting library (like Matplotlib) with a route method, and developers can now more easily create PixieApp HTML fragments with Jinja. See Release notes for 1.1.4.
Week ending December 15, 2017
Assign project and catalog creation rights in the Admin Console
Your account members don’t have to be an administrator of a Cloud Object Storage instance to create projects or catalogs. You can now specify which instances of Cloud Object Storage can be used by non-administrators on the Project & Catalog Creation page of the Admin Console.
Watson Knowledge Catalog and policies
Include more information when importing business terms
When importing business terms to the Business Glossary, you can now add the business definition and the state of the business term to the CSV file you want to import.
You can now sort policies contained in a category by columns, such as by name, status, date, and so on. By default, policies are sorted by status.
GUI operation enhancements
- The Search bar in the Operation list helps you quickly find the operation you’re looking for.
- The Join operation in the ORGANIZE category can join two data sets in a variety of ways. You can perform a full join, inner join, left join, right join, semi join, or anti join. You can also select the columns you want to see in the result set, and if there are same-named columns between the two data sets, you can specify unique suffixes to differentiate them.
- You no longer need to select a column before clicking the Operation menu. You’ll be prompted to make a selection after you choose an operation and only those columns that are appropriate for that operation are selectable.
- A snapshot of the selected columns is shown to the right of the Operation pane so you can see the data while you fill in operation details.
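The six join types and the suffix handling described above can be sketched in plain Python. This is an illustration of the join semantics only, not how Data Refinery implements the operation; the function name `join_rows`, the default suffixes, and the sample data are all hypothetical.

```python
def join_rows(left, right, key, how="inner", suffixes=("_l", "_r")):
    """Join two lists of dict rows on `key` (illustrative sketch only).

    Assumes every row in a data set has the same columns.
    """
    if how not in ("inner", "left", "right", "full", "semi", "anti"):
        raise ValueError(f"unknown join type: {how}")

    right_keys = {row[key] for row in right}
    # Semi and anti joins only filter the left data set; no columns are added.
    if how == "semi":
        return [dict(row) for row in left if row[key] in right_keys]
    if how == "anti":
        return [dict(row) for row in left if row[key] not in right_keys]

    left_cols = [c for row in left[:1] for c in row if c != key]
    right_cols = [c for row in right[:1] for c in row if c != key]
    shared = set(left_cols) & set(right_cols)

    def name(col, side):
        # Same-named columns get a suffix so they stay distinguishable.
        return col + suffixes[side] if col in shared else col

    def combine(l_row, r_row):
        source = l_row if l_row is not None else r_row
        out = {key: source[key]}
        for col in left_cols:
            out[name(col, 0)] = l_row[col] if l_row is not None else None
        for col in right_cols:
            out[name(col, 1)] = r_row[col] if r_row is not None else None
        return out

    result = []
    matched_right = set()  # indexes of right rows that found a partner
    for l_row in left:
        matches = [(i, r) for i, r in enumerate(right) if r[key] == l_row[key]]
        for i, r_row in matches:
            matched_right.add(i)
            result.append(combine(l_row, r_row))
        if not matches and how in ("left", "full"):
            result.append(combine(l_row, None))
    if how in ("right", "full"):
        for i, r_row in enumerate(right):
            if i not in matched_right:
                result.append(combine(None, r_row))
    return result


# Hypothetical sample data with a same-named column ("name") in both sets.
customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}, {"id": 3, "name": "Cy"}]
orders = [{"id": 1, "name": "Book"}, {"id": 3, "name": "Pen"}, {"id": 4, "name": "Ink"}]

for how in ("inner", "left", "right", "full", "semi", "anti"):
    print(how, [row["id"] for row in join_rows(customers, orders, "id", how)])
```

A semi join keeps the left rows that have a match and an anti join keeps those that do not, which is why neither adds columns from the second data set.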
Code operation enhancements
- The command line has new operation- and function-level help to assist you in quickly and easily creating customized operations that you can apply to your data.
- Background highlighting of command line elements provides a visual indicator that syntax, column, and function suggestions are available. Just click the elements to invoke the suggestions.
- Coming soon… template-level help!
Data format specification
When file-based data is read into Data Refinery, if it doesn’t look like it should, click the Specify data format icon. To ensure that Data Refinery can correctly read your data, modify the data format assumptions, such as whether the first line contains column headers, what the field delimiter is, and what the quote and escape characters are.
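To see why these assumptions matter, here is a minimal sketch that uses Python's standard `csv` module rather than Data Refinery itself. The helper `read_delimited`, the generated `COLUMN1`-style names, and the sample text are hypothetical; the point is only that the same file parses very differently when the header, delimiter, quote, and escape settings are wrong.

```python
import csv
import io


def read_delimited(text, has_header=True, delimiter=",", quotechar='"', escapechar=None):
    """Parse delimited text into a list of dict rows under the given format assumptions."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=delimiter,
        quotechar=quotechar,
        escapechar=escapechar,
        # When an escape character is given, do not also treat doubled quotes as escapes.
        doublequote=escapechar is None,
    )
    rows = list(reader)
    if not rows:
        return []
    if has_header:
        header, data = rows[0], rows[1:]
    else:
        # No header line: invent positional column names (hypothetical naming).
        header = [f"COLUMN{i + 1}" for i in range(len(rows[0]))]
        data = rows
    return [dict(zip(header, row)) for row in data]


# Semicolon-delimited, single-quoted, backslash-escaped sample text.
raw = "name;note\n'Smith; John';'said \\'hi\\''\n"

print(read_delimited(raw))  # default assumptions: everything lands in one column
print(read_delimited(raw, delimiter=";", quotechar="'", escapechar="\\"))
```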
Week ending December 8, 2017
When you activate Watson services, you can now activate all the Watson services that you want on a single screen. See Activate Watson services.
Improved getting started experience
When you sign in to Watson services, the Get Started information on the landing page shows more key tasks so you can be productive faster.
More control over your services
You no longer provision a Spark service and an object storage service when you sign up for Watson Studio. Instead, when you create a project you provision the object storage type that you want and you can choose whether to include a Spark service in your project. See Set up a project.
View scheduled job details
You can now view details about the scheduled jobs for running notebooks without editing the schedule. While editing the notebook, click the Schedule icon and then choose View job details. See Schedule a notebook.
Watson Knowledge Catalog
General availability for Watson Knowledge Catalog
Watson Knowledge Catalog service is now generally available (GA). Read this blog to learn how to switch your beta catalogs to a GA plan. Read this FAQ to understand what happens to your beta functionality when you switch to a GA plan.
Discover data assets from connections
You can discover assets from a connection, so that all user tables and views accessible from the connection are added as data assets to the project that you select. From the project, you can evaluate each data asset and publish the ones you want to the catalog.
You can discover assets from connections to the following data sources:
- IBM Cloud Object Storage (IaaS)
- IBM Cloud Object Storage
- Db2 on Cloud
- Db2 Warehouse on Cloud
- Microsoft SQL Server
- MySQL on Compose
- Postgres on Compose
Data class groups for rules
When a data asset is added to a catalog with policies enforced, it is automatically profiled and classified as part of the policy framework. The profiling process samples the data asset and leverages different algorithms to determine the type of content in the data asset.
Automatic profiling is based on over 160 data classes provided by IBM. These data classes are categorized into 12 data class groups provided by IBM. You can now select one of these data class groups when defining rules instead of having to select individual data classes from a long list. For example, if you want to restrict access to a data set that contains personal information, you can select the data class group Personal Information, which comprises basic attributes of an individual, such as person name, date of birth, and gender. See data class groups.
Edit policies and rules
You can now edit published policies to refresh policy details and to add or delete the rules they contain. If you just want to change the name or description of a policy, hover over the information you want to update. To add or delete rules contained in a published policy, click Edit and select which rules you want to delete, add, or create. See Finding and viewing a policy. To edit category details, hover over the name or description of a category.
Week ending December 1, 2017
Improved security: restrict project membership
When you create a project, you can now choose to restrict who can be a collaborator. If you select the Restrict who can be a collaborator checkbox, you can add only members of your IBM Cloud account to the project, or, if your company has SAML federation set up in IBM Cloud, only employees of your company.
The project must be restricted to add catalog assets. See Set up a project.
Existing projects can no longer get assets from catalogs
You can no longer add assets from a catalog to existing projects that are not restricted. However, any catalog assets that you previously added to an unrestricted project remain in the project.
View data classes for data assets
If you have Watson Knowledge Catalog, you can create a profile of a data asset to view the data classes that are inferred for each column in any data asset in a project. Click the data asset name to see the preview and then click the Profile tab. Click Create Profile to start the profiling process.
PixieDust support for Brunel
PixieDust 1.1.3 supports Brunel as an additional chart-rendering option for feature-rich interactive data visualizations. The Brunel renderer for PixieDust supports all chart types: bar, line, scatter, pie, histogram, and maps. Maps, however, support extra visualization options: heatmap, treemap, and chords. See Release notes for 1.1.3.
Watson Knowledge Catalog
Improved security: restricted membership
Catalog membership is now restricted to members of your IBM Cloud account, or, if your company has SAML federation set up in IBM Cloud, employees of your company. To add catalog assets to a project, the project must be similarly restricted. However, members of unrestricted projects can publish assets to a catalog if they are members of both and have sufficient permissions. See Manage access to a catalog.
View data classes for data assets
The profile of a data asset shows the data classes that are inferred for each column in the data set. You can see the profile when you view the asset and click the Profile tab. In catalogs with policies enforced, data asset profiles are created automatically, based on the first 5000 rows of the data set. In catalogs that do not have policies enforced, data assets are not profiled automatically. You must create a profile. See Profile data assets.
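The idea of profiling from a sample can be sketched as follows. Only the 5000-row sample size comes from the text above; the two classifiers, the 80% threshold, and the function name `profile` are assumptions made for illustration and do not reproduce the IBM-provided data classes or the actual profiling algorithms.

```python
import re
from itertools import islice

SAMPLE_SIZE = 5000  # from the text: profiles are based on the first 5000 rows

# Hypothetical classifiers, for illustration only.
CLASSIFIERS = {
    "Email Address": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "Year": re.compile(r"(19|20)\d{2}"),
}


def profile(rows, threshold=0.8):
    """Infer one data class per column from the first SAMPLE_SIZE rows.

    A column gets a class when at least `threshold` of the sampled values
    match that class's pattern; otherwise it is left unclassified (None).
    """
    sample = list(islice(rows, SAMPLE_SIZE))
    if not sample:
        return {}
    result = {}
    for column in sample[0]:
        values = [str(row[column]) for row in sample]
        result[column] = None
        for class_name, pattern in CLASSIFIERS.items():
            hits = sum(1 for value in values if pattern.fullmatch(value))
            if hits / len(values) >= threshold:
                result[column] = class_name
                break
    return result


sample_rows = [
    {"contact": "a@b.com", "joined": "2017", "comment": "ok"},
    {"contact": "c@d.org", "joined": "2016", "comment": "late"},
]
print(profile(sample_rows))
```

Because only the sample is inspected, rows beyond the first 5000 have no influence on the inferred classes in this sketch.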
You can now delete connection assets from a catalog.
- Disqualified rows are shown in preview - In the Edit Schema window, you can preview the incoming events based on the defined schema of the source operator. Disqualified values are highlighted in red. Preview helps you in two ways:
  - You’ll get an indication of which events will be discarded when they don’t comply with the defined schema.
  - You’ll save time because you don’t need to run the streams flow in order to discover a mismatch with the schema.
- Download logs and link to Streaming Analytics instance - Streams Designer provides a notification panel in the Metrics page which shows any compilation or runtime errors. To help you debug an error, you can now download the logs from the Streaming Analytics instance. A link is also provided to the Streaming Analytics instance that is used to run the streams flow.
- Automatic restart of Streaming Analytics instance - If the Streaming Analytics instance is stopped while you’re running a streams flow, you can now automatically restart the instance in Streams Designer without having to go to IBM Cloud to do so. If the instance cannot be started (for example, the Lite plan expired and the instance is disabled), you will receive a message with a link to the instance on IBM Cloud.
- Indication of an ‘unhealthy’ running streams flow - When running a streams flow, Streams Designer will now indicate whether the flow is ‘unhealthy’. This tells you that there are issues with the running of the flow that should be investigated. Look for errors in the Notification panel or download the logs.
Week ending November 24, 2017
Mention users in comments in notebooks
While editing a notebook, you can mention another user, who is a project collaborator, in a comment. Only that user is notified of the comment. To mention a user in a comment, enter the @ symbol and start entering the user’s name until you can choose it from the search results: for example, @joe_blue. Then Joe Blue receives a notification that you mentioned him in a comment in a notebook.
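As a rough illustration of the rule that only mentioned collaborators are notified, the following sketch extracts @ mentions from a comment and keeps those that belong to the project. The pattern, the function name `users_to_notify`, and the user names are hypothetical and say nothing about how the notebook editor resolves mentions internally.

```python
import re

# Assumed mention syntax: "@" followed by letters, digits, or underscores.
MENTION = re.compile(r"@(\w+)")


def users_to_notify(comment, collaborators):
    """Return mentioned user names that are project collaborators, in order, without repeats."""
    notified = []
    for name in MENTION.findall(comment):
        if name in collaborators and name not in notified:
            notified.append(name)
    return notified


collaborators = {"joe_blue", "ann_green"}
print(users_to_notify("@joe_blue can you check cell 4? cc @not_a_member", collaborators))
```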
Week ending November 17, 2017
Streams Designer in open beta
Use Streams Designer to collect, curate, analyze, and act on massive amounts of changing data in real time. Regardless of whether the data is structured or unstructured, you can leverage data at scale to drive real-time analytics for up-to-the-minute business decisions. See Get started with Streams Designer.
If you participated in the streaming pipelines closed beta, here are the new features for open beta.
Streaming pipelines now has a new name – Streams Designer! The streaming pipeline capability is now called Streams Designer, and the streaming pipeline asset is called streams flow.
Streams Designer in the Tools menu
You can now create a new streams flow directly from the Tools menu.
You associate the new streams flow with a project in the New Streams Flow window.
Support for new IBM Cloud Object Storage (IAM support)
You can now connect and stream data to your IBM Cloud Object Storage instances by using the IBM Cloud Object Storage operator.
Full integration with connections
Streams Designer is now fully integrated with connections. You select your data source by using a connection. Within your streams flow, you can reuse the connections that you defined in the project. Existing connections can be dragged onto the canvas to create pre-connected operators to service instances. You can also create a new connection that can be used in other streams flows in the project.
New Db2 Warehouse operator
Db2 Warehouse on Cloud is a fully-managed, enterprise-class, cloud data warehouse service. Use the new Db2 Warehouse operator to connect to your Db2 Warehouse on Cloud instances.
Streams Flows ‘View All’
In the Assets tab of a project’s Project page, the ‘View all’ link is shown when there are more than 10 streams flows.
Click the ‘View all’ link to see all streams flows in table or tile view. You also get useful information about your streams flows, such as status, running time, and more.
A quick guided tour now introduces first-time users to concepts and features in Data Refinery.
- The Trim quotes operation in the Text category can remove single or double quotation marks that enclose text.
- The Convert column value to missing operation in the Cleanse category can convert values in the selected column to missing values in one of two ways:
  - Column values in the selected column match values in a second, specified column
  - Column values in the selected column match a specified value
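The two ways of converting values to missing can be sketched in plain Python, with `None` standing in for a missing value. The function names and sample rows are hypothetical, and the row-by-row comparison in the first function is an assumption about what matching a second column means.

```python
def to_missing_if_matches_column(rows, column, other_column):
    """Set `column` to missing (None) in rows where it equals the value in `other_column`."""
    return [
        {**row, column: None} if row[column] == row[other_column] else dict(row)
        for row in rows
    ]


def to_missing_if_equals(rows, column, value):
    """Set `column` to missing (None) in rows where it equals the specified value."""
    return [
        {**row, column: None} if row[column] == value else dict(row)
        for row in rows
    ]


rows = [
    {"city": "N/A", "billing_city": "Rome"},
    {"city": "Oslo", "billing_city": "Oslo"},
]
print(to_missing_if_equals(rows, "city", "N/A"))
print(to_missing_if_matches_column(rows, "city", "billing_city"))
```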
Data flow enhancements
- You can now save data flow output to a new connection asset by selecting a folder or schema, saving the location, and then providing a new, unique name for the target data set
- The data flow run information now includes new status icons, the number of rows read from the data source and written to the target, and the name of the user who initiated each run
Week ending November 10, 2017
New and enhanced operations
- A new Concatenate string operation in the Text category can link together any string with a column value. You can add the string to the left, right or both sides of the value.
- The Split column operation can now split a column by using a regular expression pattern. This new method joins the text, position, and default methods already in place.
- The Replace substring operation can now perform replacement based on a regular expression pattern, in addition to the already-supported text method.
- The Filter operation now supports multiple conditions on a single filter. You can combine the different conditions with AND or OR operators.
- The Replace missing values operation is now supported for string columns as well as numeric columns.
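The behavior of these operations can be approximated with Python's standard `re` module. This is a sketch of the semantics only; the function names, the use of `None` for missing values, and the sample rows are hypothetical, and Data Refinery's regular expression dialect may differ from Python's.

```python
import re


def concatenate(value, left="", right=""):
    """Add a string to the left, right, or both sides of a column value."""
    return f"{left}{value}{right}"


def split_by_pattern(value, pattern):
    """Split a column value wherever the regular expression matches."""
    return re.split(pattern, value)


def replace_by_pattern(value, pattern, replacement):
    """Replace every substring that matches the regular expression."""
    return re.sub(pattern, replacement, value)


def filter_rows(rows, conditions, combine="AND"):
    """Keep rows that satisfy all conditions (AND) or at least one (OR)."""
    if combine not in ("AND", "OR"):
        raise ValueError("combine must be 'AND' or 'OR'")
    check = all if combine == "AND" else any
    return [row for row in rows if check(cond(row) for cond in conditions)]


def replace_missing(rows, column, default):
    """Replace missing (None) values in a string or numeric column."""
    return [{**row, column: default} if row[column] is None else dict(row) for row in rows]


rows = [
    {"sku": "A-100", "qty": 5, "color": None},
    {"sku": "B-200", "qty": 0, "color": "red"},
    {"sku": "A-300", "qty": 9, "color": "blue"},
]
starts_with_a = lambda row: row["sku"].startswith("A")
in_stock = lambda row: row["qty"] > 0

print(concatenate("42", left="ID-"))
print(split_by_pattern("2017-11/10", r"[-/]"))
print(replace_by_pattern("a1b22c", r"\d+", "#"))
print(filter_rows(rows, [starts_with_a, in_stock], combine="AND"))
print(replace_missing(rows, "color", "unknown"))
```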
Watson Machine Learning
New and enhanced operations
- The Flow Editor has new navigation features. For an overview of the changes, you can take the tour, which is available from within the Flow Editor work area when you create a new flow.
- Machine Learning notebooks have been updated to include the most-recent API changes, such as the requirement to use the Bearer token for authorization calls.
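In the Bearer scheme, the token is sent in the HTTP Authorization header, prefixed with the word Bearer. A minimal sketch with Python's standard library follows; the URL and token are placeholders, the helper name is hypothetical, and no request is actually sent. The real endpoints and the way to obtain a token are described in the Watson Machine Learning documentation, not here.

```python
import urllib.request


def build_authorized_request(url, token):
    """Build (but do not send) a request that carries a Bearer token."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# Placeholder values for illustration only.
request = build_authorized_request("https://example.com/placeholder-endpoint", "my-token")
print(request.get_header("Authorization"))
```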
Week ending November 3, 2017
New Watson services in open beta
The new Watson services Watson Knowledge Catalog and Data Refinery are now in open beta. If you have a Watson Studio account, you can try them out for free by clicking your avatar and then Add other services. Or you can sign up for any of the services.
New type of project
The new type of project, called a Watson project, has these new features:
- The project uses IBM Cloud Object Storage.
- You can use the new Watson services with the project.
- You create and edit connections within the project.
To create a Watson project, choose IBM Cloud Object Storage when you create the project. You can continue to create legacy-style projects that use Object Storage OpenStack Swift.