Use custom software_spec to create statsmodels function describing data with ibm-watsonx-ai
¶
This notebook demonstrates how to deploy a Python function that uses statsmodels in the Watson Machine Learning service.
The deployment requires a custom software specification, created from a conda YAML file that lists all required libraries.
Some familiarity with bash is helpful. This notebook uses Python 3.11 with statsmodels.
Learning goals¶
The learning goals of this notebook are:
- Working with the Watson Machine Learning instance
- Creating custom software specification
- Online deployment of python function
- Scoring data using deployed function
Contents¶
This notebook contains the following parts:
1. Set up the environment¶
Before you use the sample code in this notebook, you must perform the following setup tasks:
- Create a Watson Machine Learning (WML) Service instance (a free plan is offered and information about how to create the instance can be found here).
!pip install -U ibm-watsonx-ai | tail -n 1
Connection to WML¶
Authenticate the Watson Machine Learning service on IBM Cloud. You need to provide the platform api_key and the instance location.
You can use the IBM Cloud CLI to retrieve the platform API key and the instance location.
An API key can be generated in the following way:
ibmcloud login
ibmcloud iam api-key-create API_KEY_NAME
From the output, copy the value of api_key.
The location of your WML instance can be retrieved in the following way:
ibmcloud login --apikey API_KEY -a https://cloud.ibm.com
ibmcloud resource service-instance WML_INSTANCE_NAME
From the output, copy the value of location.
Tip: Your Cloud API key can be generated by going to the Users section of the Cloud console. From that page, click your name, scroll down to the API Keys section, and click Create an IBM Cloud API key. Give your key a name and click Create, then copy the created key and paste it below. You can also get a service-specific URL by going to the Endpoint URLs section of the Watson Machine Learning docs. You can check your instance location in your Watson Machine Learning (WML) Service instance details.
You can also get a service-specific API key by going to the Service IDs section of the Cloud Console. From that page, click Create, then copy the created key and paste it below.
Action: Enter your api_key and location in the following cell.
api_key = 'PASTE YOUR PLATFORM API KEY HERE'
location = 'PASTE YOUR INSTANCE LOCATION HERE'
from ibm_watsonx_ai import Credentials
credentials = Credentials(
    api_key=api_key,
    url='https://' + location + '.ml.cloud.ibm.com'
)
from ibm_watsonx_ai import APIClient
client = APIClient(credentials)
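Pasting a key into a cell makes it easy to leak by accident. A minimal sketch of an alternative, assuming the key and location are exported as environment variables. The variable names WML_API_KEY and WML_LOCATION, the fallback region, and the helper build_wml_url are hypothetical and not part of ibm-watsonx-ai:

```python
import os


def build_wml_url(location: str) -> str:
    """Build the regional endpoint URL the same way the cell above does."""
    location = location.strip()
    if not location:
        raise ValueError("instance location must not be empty")
    return "https://" + location + ".ml.cloud.ibm.com"


# Hypothetical variable names; placeholders are used when they are unset.
env_api_key = os.environ.get("WML_API_KEY", "PASTE YOUR PLATFORM API KEY HERE")
env_location = os.environ.get("WML_LOCATION", "us-south")

url = build_wml_url(env_location)
print(url)
```

The resulting env_api_key and url could then be passed to Credentials in place of the pasted values.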
Working with spaces¶
First, create a space that will be used for your work. If you do not have a space yet, you can use the Deployment Spaces Dashboard to create one.
- Click New Deployment Space
- Create an empty space
- Select Cloud Object Storage
- Select Watson Machine Learning instance and press Create
- Copy space_id and paste it below
Tip: You can also use SDK to prepare the space for your work. More information can be found here.
Action: Assign space ID below
space_id = 'PASTE YOUR SPACE ID HERE'
You can use the list method to print all existing spaces.
client.spaces.list(limit=10)
To interact with all resources available in Watson Machine Learning, you need to set the space that you will be using.
client.set.default_space(space_id)
'SUCCESS'
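If you know the space by name rather than by ID, the ID can be looked up from the space details. A minimal sketch, assuming that client.spaces.get_details() returns a dict with a "resources" list whose items carry the name under "entity" and the ID under "metadata"; that layout, the helper find_space_id, and the sample values are assumptions for illustration:

```python
from typing import Optional


def find_space_id(spaces_details: dict, name: str) -> Optional[str]:
    """Return the ID of the first space with the given name, or None."""
    for resource in spaces_details.get("resources", []):
        if resource.get("entity", {}).get("name") == name:
            return resource.get("metadata", {}).get("id")
    return None


# Hand-written stand-in for the result of client.spaces.get_details().
example_details = {
    "resources": [
        {"metadata": {"id": "space-id-1"}, "entity": {"name": "sandbox"}},
        {"metadata": {"id": "space-id-2"}, "entity": {"name": "statsmodels demo"}},
    ]
}

print(find_space_id(example_details, "statsmodels demo"))
```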
2. Create function¶
In this section you will learn how to create a deployable function
that uses the statsmodels module to compute a description of the given data.
Hint: To install statsmodels execute !pip install statsmodels.
Create a deployable callable which uses the statsmodels library¶
def deployable_callable():
    """
    Deployable python function with score function implemented.
    """
    try:
        from statsmodels.stats.descriptivestats import describe
    except ModuleNotFoundError as e:
        print(f"statsmodels not installed: {str(e)}")

    def score(payload):
        """
        Score method.
        """
        try:
            data = payload['input_data'][0]['values']
            return {
                'predictions': [
                    {'values': str(describe(data))}
                ]
            }
        except Exception as e:
            return {'predictions': [{'values': [repr(e)]}]}

    return score
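The callable above is a closure: calling the outer function returns the inner score function, which receives a payload with an input_data list and returns a dict with a predictions list. A minimal sketch of the same pattern using only the standard library, which may help when checking the payload contract without statsmodels installed. The name mean_callable and the column-mean logic are hypothetical stand-ins, not part of the notebook:

```python
from statistics import mean


def mean_callable():
    """Outer function: performs setup and returns the scoring function."""

    def score(payload):
        """Return the column means of the rows in the first input_data entry."""
        try:
            rows = payload["input_data"][0]["values"]
            columns = list(zip(*rows))  # transpose rows into columns
            return {"predictions": [{"values": [mean(col) for col in columns]}]}
        except Exception as e:
            # Same error convention as the statsmodels callable.
            return {"predictions": [{"values": [repr(e)]}]}

    return score


score = mean_callable()
result = score({"input_data": [{"values": [[1, 2], [3, 4]]}]})
print(result)
```

Passing a malformed payload, such as an empty dict, yields a predictions entry that holds the repr of the exception instead of raising.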
Test callable locally¶
Hint: To install numpy execute !pip install numpy.
import numpy as np
data = np.random.randn(10, 10)
data_description = deployable_callable()({
    "input_data": [{
        "values": data
    }]
})
print(data_description["predictions"][0]["values"])
                           0          1          2          3          4  \
nobs               10.000000  10.000000  10.000000  10.000000  10.000000
missing             0.000000   0.000000   0.000000   0.000000   0.000000
mean                0.064913   0.245574  -0.289946   0.354056   0.039773
std_err             0.319769   0.384418   0.299716   0.282352   0.301534
upper_ci            0.691649   0.999020   0.297486   0.907455   0.630769
lower_ci           -0.561823  -0.507872  -0.877377  -0.199344  -0.551224
std                 1.011199   1.215637   0.947784   0.892875   0.953535
iqr                 1.411132   2.243018   1.318020   1.662204   0.978132
iqr_normal          1.046074   1.662752   0.977050   1.232193   0.725090
mad                 0.826575   1.063895   0.783538   0.742548   0.702243
mad_normal          1.035958   1.333395   0.982019   0.930646   0.880132
coef_var           15.577700   4.950184  -3.268833   2.521850  23.974640
range               3.347027   3.159219   2.711322   2.294751   3.304793
max                 1.885559   1.702810   1.291228   1.439570   2.152607
min                -1.461468  -1.456409  -1.420094  -0.855181  -1.152185
skew                0.190224  -0.239921   0.416229  -0.073070   0.851185
kurtosis            2.221879   1.427371   1.877756   1.577746   3.490762
jarque_bera         0.312589   1.126422   0.813508   0.851735   1.307881
jarque_bera_pval    0.855307   0.569378   0.665808   0.653203   0.519993
mode               -1.461468  -1.456409  -1.420094  -0.855181  -1.152185
mode_freq           0.100000   0.100000   0.100000   0.100000   0.100000
median              0.115608   0.572933  -0.373414   0.377773   0.240160
1%                 -1.415495  -1.433457  -1.409014  -0.842809  -1.135009
5%                 -1.231604  -1.341646  -1.364698  -0.793321  -1.066302
10%                -1.001740  -1.226882  -1.309302  -0.731462  -0.980418
25%                -0.704587  -0.945256  -1.052150  -0.454896  -0.620614
50%                 0.115608   0.572933  -0.373414   0.377773   0.240160
75%                 0.706546   1.297762   0.265870   1.207308   0.357517
90%                 0.978912   1.516863   1.006978   1.402771   0.604352
95%                 1.432236   1.609837   1.149103   1.421170   1.378480
99%                 1.794895   1.684215   1.262803   1.435890   1.997782

                           5          6          7          8          9
nobs               10.000000  10.000000  10.000000  10.000000  10.000000
missing             0.000000   0.000000   0.000000   0.000000   0.000000
mean               -0.040171   0.362029  -0.521647  -0.343284  -0.325551
std_err             0.338607   0.264505   0.265433   0.203170   0.308506
upper_ci            0.623487   0.880450  -0.001407   0.054922   0.279111
lower_ci           -0.703828  -0.156391  -1.041886  -0.741489  -0.930212
std                 1.070769   0.836439   0.839374   0.642480   0.975582
iqr                 1.474389   1.011247   0.502547   0.446155   1.223176
iqr_normal          1.092966   0.749638   0.372539   0.330735   0.906742
mad                 0.871267   0.675793   0.528876   0.451689   0.753957
mad_normal          1.091972   0.846981   0.662848   0.566108   0.944945
coef_var          -26.655545   2.310418  -1.609085  -1.871570  -2.996715
range               3.400602   2.377388   3.383200   2.314276   3.316589
max                 1.249853   1.740157   1.242683   0.951935   0.923821
min                -2.150750  -0.637231  -2.140517  -1.362341  -2.392769
skew               -0.534469   0.407019   0.222036   0.531693  -0.745463
kurtosis            2.445541   1.967317   4.143304   3.058282   3.036370
jarque_bera         0.604190   0.720454   0.626809   0.472578   0.926743
jarque_bera_pval    0.739268   0.697518   0.730954   0.789552   0.629159
mode               -2.150750  -0.637231  -2.140517  -1.362341  -2.392769
mode_freq           0.100000   0.100000   0.100000   0.100000   0.100000
median             -0.084632   0.306461  -0.460995  -0.442976  -0.243133
1%                 -2.036969  -0.633988  -2.042521  -1.323388  -2.268127
5%                 -1.581844  -0.621016  -1.650536  -1.167573  -1.769563
10%                -1.012938  -0.604801  -1.160554  -0.972804  -1.146357
25%                -0.643756  -0.279652  -0.787225  -0.574531  -0.811178
50%                -0.084632   0.306461  -0.460995  -0.442976  -0.243133
75%                 0.830633   0.731594  -0.284678  -0.128376   0.411998
90%                 1.106676   1.571031  -0.078543   0.386313   0.625307
95%                 1.178264   1.655594   0.582070   0.669124   0.774564
99%                 1.235535   1.723244   1.110560   0.895373   0.893969
3. Upload python function¶
In this section you will learn how to upload the python function to the Cloud.
Custom software_specification¶
Create a new software specification based on the default Python 3.11 environment, extended with the statsmodels package.
config_yml =\
"""
name: python311
channels:
- conda-forge
- nodefaults
dependencies:
- statsmodels
prefix: /opt/anaconda3/envs/python311
"""
with open("config.yaml", "w", encoding="utf-8") as f:
    f.write(config_yml)
base_sw_spec_id = client.software_specifications.get_id_by_name("runtime-24.1-py3.11")
!cat config.yaml
name: python311
channels:
- conda-forge
- nodefaults
dependencies:
- statsmodels
prefix: /opt/anaconda3/envs/python311
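When the list of extra packages changes often, the YAML text can be generated instead of typed by hand. A minimal sketch that renders the same keys as the file above; the helper render_conda_yaml is hypothetical, and the prefix path simply mirrors the one used in this notebook:

```python
def render_conda_yaml(env_name, packages, channels=("conda-forge", "nodefaults")):
    """Render a conda environment file with the same keys as the one above."""
    lines = [f"name: {env_name}", "channels:"]
    lines += [f"- {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"- {package}" for package in packages]
    lines.append(f"prefix: /opt/anaconda3/envs/{env_name}")
    return "\n".join(lines) + "\n"


yaml_text = render_conda_yaml("python311", ["statsmodels"])
print(yaml_text)
```

The returned string can be written to config.yaml in the same way as config_yml.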
The config.yaml file describes the details of the package extension. Now you need to store the new package extension with APIClient.
meta_prop_pkg_extn = {
    client.package_extensions.ConfigurationMetaNames.NAME: "statsmodels env",
    client.package_extensions.ConfigurationMetaNames.DESCRIPTION: "Environment with statsmodels",
    client.package_extensions.ConfigurationMetaNames.TYPE: "conda_yml"
}
pkg_extn_details = client.package_extensions.store(meta_props=meta_prop_pkg_extn, file_path="config.yaml")
pkg_extn_id = client.package_extensions.get_id(pkg_extn_details)
pkg_extn_url = client.package_extensions.get_href(pkg_extn_details)
Creating package extensions SUCCESS
Create a new software specification and add the created package extension to it.¶
meta_prop_sw_spec = {
    client.software_specifications.ConfigurationMetaNames.NAME: "statsmodels software_spec",
    client.software_specifications.ConfigurationMetaNames.DESCRIPTION: "Software specification for statsmodels",
    client.software_specifications.ConfigurationMetaNames.BASE_SOFTWARE_SPECIFICATION: {"guid": base_sw_spec_id}
}
sw_spec_details = client.software_specifications.store(meta_props=meta_prop_sw_spec)
sw_spec_id = client.software_specifications.get_id(sw_spec_details)
client.software_specifications.add_package_extension(sw_spec_id, pkg_extn_id)
SUCCESS
'SUCCESS'
Get the details of created software specification¶
client.software_specifications.get_details(sw_spec_id)
Store the function¶
meta_props = {
    client.repository.FunctionMetaNames.NAME: "statsmodels function",
    client.repository.FunctionMetaNames.SOFTWARE_SPEC_ID: sw_spec_id
}
function_details = client.repository.store_function(meta_props=meta_props, function=deployable_callable)
function_id = client.repository.get_function_id(function_details)
Get function details¶
client.repository.get_details(function_id)
{'entity': {'software_spec': {'id': '05020db0-8207-4ddd-90d3-17c6b83e55df', 'name': 'statsmodels software_spec1'}, 'type': 'python'}, 'metadata': {'created_at': '2024-03-06T11:17:31.294Z', 'id': '542ff1cc-6fa1-4315-b0fe-523539c85145', 'modified_at': '2024-03-06T11:17:33.569Z', 'name': 'statsmodels function', 'owner': 'IBMid-55000091VC', 'space_id': '93ee84d1-b7dd-42b4-b2ca-121bc0c86315'}, 'system': {'warnings': []}}
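The details dict is verbose, so it can help to pull out only the fields you want to check. A minimal sketch based on the layout of the output shown above; the helper summarize_function is hypothetical, and the example dict is an abbreviated copy of that output:

```python
def summarize_function(details: dict) -> dict:
    """Pick the name, ID, software spec name and warnings from function details."""
    entity = details.get("entity", {})
    metadata = details.get("metadata", {})
    return {
        "name": metadata.get("name"),
        "id": metadata.get("id"),
        "software_spec": entity.get("software_spec", {}).get("name"),
        "warnings": details.get("system", {}).get("warnings", []),
    }


# Abbreviated copy of the output above.
example_details = {
    "entity": {
        "software_spec": {
            "id": "05020db0-8207-4ddd-90d3-17c6b83e55df",
            "name": "statsmodels software_spec1",
        },
        "type": "python",
    },
    "metadata": {
        "id": "542ff1cc-6fa1-4315-b0fe-523539c85145",
        "name": "statsmodels function",
    },
    "system": {"warnings": []},
}

print(summarize_function(example_details))
```

An empty warnings list together with the expected software spec name is a quick sign that the function was stored against the custom specification.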
Note: You can see that the function is successfully stored in the Watson Machine Learning service.
client.repository.list_functions()
4. Create online deployment¶
You can use the commands below to create an online deployment (web service) for the stored function.
Create online deployment of a python function¶
metadata = {
    client.deployments.ConfigurationMetaNames.NAME: "Deployment of statsmodels function",
    client.deployments.ConfigurationMetaNames.ONLINE: {}
}
function_deployment = client.deployments.create(function_id, meta_props=metadata)
client.deployments.list()
Get deployment id.
deployment_id = client.deployments.get_id(function_deployment)
print(deployment_id)
9b723c1c-5d8f-4f1f-bc4e-bf7d582a4ba6
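Before scoring, you may want to confirm that the deployment is ready. A minimal sketch, assuming that the dict returned by client.deployments.get_details(deployment_id) reports the lifecycle state under entity, status, state; that layout and the helper deployment_state are assumptions for illustration:

```python
def deployment_state(details: dict) -> str:
    """Return the state reported in a deployment details dict, or 'unknown'."""
    return details.get("entity", {}).get("status", {}).get("state", "unknown")


# Hand-written stand-in for client.deployments.get_details(deployment_id).
example_details = {"entity": {"status": {"state": "ready"}}}

print(deployment_state(example_details))
```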
scoring_payload = {
    "input_data": [{
        'values': data
    }]
}
predictions = client.deployments.score(deployment_id, scoring_payload)
print(predictions["predictions"][0]["values"])
                           0          1          2          3          4  \
nobs               10.000000  10.000000  10.000000  10.000000  10.000000
missing             0.000000   0.000000   0.000000   0.000000   0.000000
mean                0.226671   0.118863   0.296608   0.018030  -0.059889
std_err             0.078214   0.109682   0.077958   0.069102   0.099104
upper_ci            0.379969   0.333837   0.449404   0.153467   0.134352
lower_ci            0.073374  -0.096110   0.143813  -0.117407  -0.254130
std                 0.782143   1.096823   0.779584   0.691019   0.991042
iqr                 1.138703   0.979375   1.111591   0.625231   1.315194
iqr_normal          0.844122   0.726012   0.824023   0.463485   0.974954
mad                 0.623878   0.774813   0.618962   0.513830   0.799944
mad_normal          0.781915   0.971084   0.775754   0.643991   1.002581
coef_var            3.450557   9.227584   2.628330  38.326234 -16.547990
range               2.289961   3.407891   2.370988   2.429216   3.152545
max                 1.216507   1.544682   1.624286   1.266127   1.517858
min                -1.073454  -1.863208  -0.746702  -1.163089  -1.634687
skew               -0.180136  -0.472191   0.300966   0.263092   0.206849
kurtosis            1.900409   2.384448   1.984736   2.702869   2.043656
jarque_bera         0.557873   0.529483   0.580451   0.152149   0.452392
jarque_bera_pval    0.756588   0.767404   0.748095   0.926747   0.797562
mode               -1.073454  -1.863208  -0.746702  -1.163089  -1.634687
mode_freq           0.100000   0.100000   0.100000   0.100000   0.100000
median              0.131746   0.126305   0.196596  -0.075777  -0.289048
1%                 -1.035160  -1.819212  -0.736463  -1.097474  -1.557357
5%                 -0.881982  -1.643226  -0.695507  -0.835012  -1.248039
10%                -0.690509  -1.423244  -0.644312  -0.506934  -0.861391
25%                -0.265950  -0.093626  -0.179451  -0.372709  -0.669090
50%                 0.131746   0.126305   0.196596  -0.075777  -0.289048
75%                 0.872753   0.885749   0.932140   0.252522   0.646103
90%                 1.207454   1.400947   1.147246   0.910790   1.227843
95%                 1.211980   1.472815   1.385766   1.088459   1.372850
99%                 1.215602   1.530309   1.576582   1.230594   1.488856

                           5          6          7          8          9
nobs               10.000000  10.000000  10.000000  10.000000  10.000000
missing             0.000000   0.000000   0.000000   0.000000   0.000000
mean               -0.483244  -0.254453   0.058877   0.426315   0.070967
std_err             0.096181   0.121981   0.067393   0.132276   0.142787
upper_ci           -0.294733  -0.015374   0.190965   0.685572   0.350825
lower_ci           -0.671756  -0.493532  -0.073210   0.167058  -0.208891
std                 0.961811   1.219812   0.673929   1.322764   1.427874
iqr                 0.702320   1.694391   1.086336   0.818691   2.260862
iqr_normal          0.520631   1.256054   0.805302   0.606896   1.675980
mad                 0.595403   0.992851   0.548831   0.775163   1.241234
mad_normal          0.746228   1.244354   0.687858   0.971523   1.555656
coef_var           -1.990319  -4.793857  11.446314   3.102789  20.120208
range               3.290262   3.729499   2.020311   4.731702   4.155165
max                 0.264072   1.728591   1.288685   3.818537   2.356818
min                -3.026190  -2.000908  -0.731626  -0.913165  -1.798347
skew               -2.016374   0.280287   0.361658   1.723070   0.344541
kurtosis            6.145500   1.926644   2.092429   5.591678   1.669951
jarque_bera        10.898847   0.610973   0.561196   7.746946   0.934943
jarque_bera_pval    0.004299   0.736765   0.755332   0.020786   0.626585
mode               -3.026190  -2.000908  -0.731626  -0.913165  -1.798347
mode_freq           0.100000   0.100000   0.100000   0.100000   0.100000
median             -0.281243  -0.424602  -0.082306   0.343253  -0.500832
1%                 -2.818282  -1.945829  -0.727344  -0.906817  -1.742649
5%                 -1.986648  -1.725515  -0.710218  -0.881427  -1.519857
10%                -0.947106  -1.450121  -0.688811  -0.849689  -1.241366
25%                -0.631465  -1.180252  -0.542524  -0.252573  -1.023235
50%                -0.281243  -0.424602  -0.082306   0.343253  -0.500832
75%                 0.070856   0.514139   0.543813   0.566118   1.237627
90%                 0.251427   1.368129   0.707175   0.955073   1.826901
95%                 0.257750   1.548360   0.997930   2.386805   2.091860
99%                 0.262807   1.692545   1.230534   3.532191   2.303827
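The function defined in section 2 returns the description as a string on success and, on failure, a one-element list holding the repr of the exception, so a failed call can look like a successful response at first glance. A minimal sketch of a check for that convention, assuming the service returns the function's result unchanged; the helper extract_description and the sample responses are hypothetical:

```python
def extract_description(response: dict) -> str:
    """Return the description text, or raise if the function reported an error."""
    values = response["predictions"][0]["values"]
    if isinstance(values, list):
        # The except branch of score() wraps repr(e) in a list.
        raise RuntimeError(f"scoring failed inside the function: {values[0]}")
    return values


# Hand-written examples of the two response shapes.
ok_response = {"predictions": [{"values": "nobs 10.0 ..."}]}
error_response = {"predictions": [{"values": ["NameError(\"name 'describe' is not defined\")"]}]}

print(extract_description(ok_response))
try:
    extract_description(error_response)
except RuntimeError as err:
    print(err)
```

A NameError for describe, as in the second example, is what the function would report if statsmodels could not be imported in the deployed environment.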
6. Clean up¶
If you want to clean up all created assets:
- experiments
- trainings
- pipelines
- model definitions
- models
- functions
- deployments
see the steps in this sample notebook.
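For the assets created in this notebook specifically, a minimal sketch of one possible deletion order: the deployment first, since it depends on the function, then the function, the software specification, and the package extension. The helper clean_up and the dry-run stand-in are hypothetical; the delete methods are the ones I understand each client resource to expose, so check them against your SDK version before running against a real space:

```python
from types import SimpleNamespace


def clean_up(client, deployment_id, function_id, sw_spec_id, pkg_extn_id):
    """Delete the assets from this notebook, dependents before dependencies."""
    client.deployments.delete(deployment_id)
    client.repository.delete(function_id)
    client.software_specifications.delete(sw_spec_id)
    client.package_extensions.delete(pkg_extn_id)


class DryRunResource:
    """Stand-in for a client resource: records deletions instead of calling the service."""

    def __init__(self, kind, log):
        self.kind = kind
        self.log = log

    def delete(self, asset_id):
        self.log.append((self.kind, asset_id))


# Dry run with a fake client so the order can be inspected safely.
log = []
fake_client = SimpleNamespace(
    deployments=DryRunResource("deployment", log),
    repository=DryRunResource("function", log),
    software_specifications=DryRunResource("software_spec", log),
    package_extensions=DryRunResource("package_extension", log),
)
clean_up(fake_client, "dep-1", "fn-1", "spec-1", "ext-1")
print(log)
```

With the real client, the call would be clean_up(client, deployment_id, function_id, sw_spec_id, pkg_extn_id).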
7. Summary and next steps¶
You successfully completed this notebook! You learned how to use Watson Machine Learning for function deployment and scoring with custom software_spec.
Check out our Online Documentation for more samples, tutorials, documentation, how-tos, and blog posts.
Author¶
Jan Sołtysik, Intern in Watson Machine Learning.
Copyright © 2020-2024 IBM. This notebook and its source code are released under the terms of the MIT License.