Welcome to PixieDust

This notebook features an introduction to PixieDust, the Python library that makes data visualization easy.

This notebook runs on Python 2.7 and 3.5, with Spark 2.1.

Table of Contents


Get started

This introduction is pretty straightforward, but it wouldn't hurt to load up the PixieDust documentation so it's handy.

New to notebooks? Don't worry. Here's all you need to know to run this introduction:

  1. Make sure this notebook is in Edit mode.
  2. To run code cells, put your cursor in the cell and press Shift + Enter.
  3. The cell number will change to [*] to indicate that it is currently executing. (When starting with notebooks, it's best to run cells in order, one at a time.)
In [ ]:
# To confirm you have the latest version of PixieDust on your system, run this cell
!pip install -U --no-deps pixiedust

Now that you have PixieDust installed and up-to-date on your system, you need to import it into this notebook. This is the last dependency before you can play with PixieDust.

In [2]:
import pixiedust
Pixiedust database opened successfully
Pixiedust version 1.1.9

If you get a message telling you that you're not running the latest version of PixieDust, restart the kernel from the Kernel menu and rerun the import pixiedust command. (Any time you restart the kernel, rerun the import pixiedust command.)

Behold, display()

In the next cell, build a simple dataset and store it in a variable.

In [3]:
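# Import SQLContext explicitly in case the notebook kernel has not preloaded it
from pyspark.sql import SQLContext
# Build the SQL context (sc is the SparkContext that the notebook kernel provides)
sqlContext = SQLContext(sc)
# Create the Spark DataFrame, passing in some data, and assign it to a variable
df = spark.createDataFrame(
[("Green", 75),
 ("Blue", 25)],
["Colors","%"])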

The data in the variable df is ready to be visualized, without any further code other than the call to display().

In [4]:
# display the dataframe above as a pie chart
display(df)
[Chart output: pie chart of Colors by %]

After running the cell above, you should see a Spark DataFrame displayed as a pie chart, along with some controls to tweak the display. All that came from passing the DataFrame variable to display().

In the next cell, you'll pass more interesting data to display(), which will also offer more advanced controls.

In [5]:
# create another DataFrame, in a new variable
df2 = spark.createDataFrame(
[(2010, 'Camping Equipment', 3),
 (2010, 'Golf Equipment', 1),
 (2010, 'Mountaineering Equipment', 1),
 (2010, 'Outdoor Protection', 2),
 (2010, 'Personal Accessories', 2),
 (2011, 'Camping Equipment', 4),
 (2011, 'Golf Equipment', 5),
 (2011, 'Mountaineering Equipment',2),
 (2011, 'Outdoor Protection', 4),
 (2011, 'Personal Accessories', 2),
 (2012, 'Camping Equipment', 5),
 (2012, 'Golf Equipment', 5),
 (2012, 'Mountaineering Equipment', 3),
 (2012, 'Outdoor Protection', 5),
 (2012, 'Personal Accessories', 3),
 (2013, 'Camping Equipment', 8),
 (2013, 'Golf Equipment', 5),
 (2013, 'Mountaineering Equipment', 3),
 (2013, 'Outdoor Protection', 8),
 (2013, 'Personal Accessories', 4)],
["year","category","unique_customers"])

# This time, we've combined the dataframe and display() call in the same cell
# Run this cell 
display(df2)
[Chart output: Customers by Category clustered by Year]

display() controls

Renderers

The chart above, like the first one, is rendered by matplotlib. With PixieDust, you have other options. To toggle between renderers, use the Renderers control at top right of the display output:

  1. Bokeh is interactive; play with the controls along the top of the chart, for example, zoom and save
  2. Matplotlib is static; you can save the image as a PNG

Chart options

  1. Chart types: At top left, you should see an option to display the dataframe as a table. You should also see a dropdown menu with other chart options, including bar charts, pie charts, scatter plots, and so on.
  2. Options: Click the Options button to explore other display configurations; for example, clustering and aggregation.

Here's more on customizing display() output.
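
To make the clustering and aggregation options more concrete, here is a minimal plain-Python sketch of the kind of grouping they describe. It is not PixieDust's implementation. It uses a subset of the df2 rows above and assumes the values are summed per category and clustered by year; the helper name cluster_sum is hypothetical.

```python
from collections import defaultdict

# A subset of the rows passed to df2 above: (year, category, unique_customers)
rows = [
    (2010, "Camping Equipment", 3),
    (2010, "Golf Equipment", 1),
    (2011, "Camping Equipment", 4),
    (2011, "Golf Equipment", 5),
]


def cluster_sum(rows):
    """Sum the value for each (key, cluster) pair.

    Hypothetical helper: key = category, cluster = year,
    value = unique_customers.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for year, category, customers in rows:
        totals[category][year] += customers
    # Convert to plain dicts so the result prints cleanly
    return {category: dict(by_year) for category, by_year in totals.items()}


result = cluster_sum(rows)
for category, by_year in sorted(result.items()):
    print(category, by_year)
```

Each category ends up with one value per year, which is what a clustered bar chart draws as a group of bars.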

Load External Data

So far, you've worked with data hard-coded into this notebook. Now, load external data (CSV) from a URL.

In [ ]:
# load a CSV with pixiedust.sampleData()
df3 = pixiedust.sampleData("https://github.com/ibm-watson-data-lab/open-data/raw/master/cars/cars.csv")
display(df3)
[Chart output: Distribution of MPG per Horsepower]

You should see a scatterplot above, rendered again by matplotlib. Find the Renderer menu at top-right. You should see options for Bokeh and Seaborn. If you don't see Seaborn, it's not installed on your system. No problem, just install it by running the next cell.

In [7]:
# To install Seaborn, uncomment the next line, and then run this cell
#!pip install --user seaborn

If you installed Seaborn, you'll need to also restart your notebook kernel, and run the cell to import pixiedust again. Find Restart in the Kernel menu above.

End of chapter. Return to table of contents


Data files commonly reside in remote sources, such as public or private marketplaces or GitHub repositories. You can load comma-separated values (CSV) data files using PixieDust's sampleData method.

Prerequisites

If you haven't already, import PixieDust. Follow the instructions in Get started.

Load data

To load a data set, run pixiedust.sampleData and specify the data set URL:

In [8]:
homes = pixiedust.sampleData("https://openobjectstore.mybluemix.net/misc/milliondollarhomes.csv")
Downloading 'https://openobjectstore.mybluemix.net/misc/milliondollarhomes.csv' from https://openobjectstore.mybluemix.net/misc/milliondollarhomes.csv
Downloaded 102051 bytes
Creating pySpark DataFrame for 'https://openobjectstore.mybluemix.net/misc/milliondollarhomes.csv'. Please wait...
Loading file using 'SparkSession'
Successfully created pySpark DataFrame for 'https://openobjectstore.mybluemix.net/misc/milliondollarhomes.csv'

The pixiedust.sampleData method loads the data into an Apache Spark DataFrame, which you can inspect and visualize using display().

Inspect and preview the loaded data

To inspect the automatically inferred schema and preview a small subset of the data, you can use the DataFrame Table view, as shown in this preconfigured example:

In [9]:
display(homes)
Field types:
PROPERTY TYPE: object
ADDRESS: object
CITY: object
STATE: object
ZIP: int64
PRICE: int64
BEDS: float64
BATHS: float64
LOCATION: object
SQFT: float64
LOT SIZE: float64
YEAR BUILT: float64
DAYS ON MARKET: float64
URL: object
SOURCE: object
LISTING ID: float64
LATITUDE: float64
LONGITUDE: float64
Showing 100 of 500 rows
PROPERTY TYPE
ADDRESS
CITY
STATE
ZIP
PRICE
BEDS
BATHS
LOCATION
SQFT
LOT SIZE
YEAR BUILT
DAYS ON MARKET
URL
SOURCE
LISTING ID
LATITUDE
LONGITUDE
Condo/Co-op 30 Winchester St #3 Brookline MA 2446 1400000 3.0 3.0 Coolidge Corner 1504.0 nan 1915.0 66.0 http://www.redfin.com/MA/Brookline/30-Winchester-St-02446/unit-3/home/105251020 MLS PIN 58480309.0 42.3420632 -71.1257602
Single Family Residential 2 Wellington Way Bedford MA 1730 1150000 4.0 3.5 Wellington Way 3531.0 43560.0 2012.0 58.0 http://www.redfin.com/MA/Bedford/2-Wellington-Way-01730/home/41363649 MLS PIN 59806880.0 42.5029123 -71.2849657
Condo/Co-op 1 Franklin St #1008 Boston MA 2110 2049000 2.0 2.0 Midtown 1476.0 nan 2016.0 59.0 http://www.redfin.com/MA/Boston/1-Franklin-St-02108/unit-1008/home/109481369 MLS PIN 62725868.0 42.35631 -71.05945
Single Family Residential 1 Wilshire Rd Newbury MA 1951 2225000 4.0 5.5 Wilshire Road 4214.0 18138.0 2014.0 58.0 http://www.redfin.com/MA/Newbury/1-Wilshire-Rd-01951/home/105539600 MLS PIN 59011440.0 42.7796754 -70.8476708
Townhouse 170 Harvard St Unit 1 Newton MA 2460 1100000 4.0 3.5 Newtonville 2388.0 10089.0 1910.0 66.0 http://www.redfin.com/MA/Newton/170-Harvard-St-02460/unit-1/home/109313528 MLS PIN 62550577.0 42.3468986 -71.2005455
Single Family Residential 1 Jerusalem Ln Cohasset MA 2025 1437000 4.0 3.5 Jerusalem Road/Atlantic Avenue/Jerusalem Lane Cul De Sac 2724.0 9443.0 2000.0 66.0 http://www.redfin.com/MA/Cohasset/1-Jerusalem-Ln-02025/home/8835487 MLS PIN 60777396.0 42.259862 -70.811424
Single Family Residential 34 Crestwood Rd Marblehead MA 1945 2997000 1.0 5.0 None 8509.0 34400.0 2012.0 nan http://www.redfin.com/MA/Marblehead/34-Crestwood-Rd-01945/home/11768413 None nan 42.501777 -70.877002
Single Family Residential 217 Forest St Winchester MA 1890 1155000 4.0 3.5 Muraco School District 3779.0 9234.0 2013.0 72.0 http://www.redfin.com/MA/Winchester/217-Forest-St-01890/home/110057079 MLS PIN 56198065.0 42.4714541 -71.1156268
Single Family Residential 1 Denny St Westborough MA 1581 1100000 6.0 3.5 Westborough 3394.0 239580.0 1917.0 79.0 http://www.redfin.com/MA/Westborough/1-Denny-St-01581/home/16634032 MLS PIN 60054100.0 42.259306 -71.611866
Single Family Residential 23 Laurel Hill Ln Winchester MA 1890 1475000 5.0 4.0 Winchester 4037.0 12475.0 2014.0 80.0 http://www.redfin.com/MA/Winchester/23-Laurel-Hill-Ln-01890/home/11439968 MLS PIN 58047231.0 42.4724111 -71.1196572
Single Family Residential 14 Hillside Ave Cambridge MA 2140 5350000 3.0 2.5 None 4557.0 15526.0 1905.0 nan http://www.redfin.com/MA/Cambridge/14-Hillside-Ave-02140/home/110093849 None nan 42.3853181 -71.1238296
Single Family Residential 22 Garfield Rd Belmont MA 2478 1560000 5.0 3.5 Belmont 3137.0 12614.0 1936.0 80.0 http://www.redfin.com/MA/Belmont/22-Garfield-Rd-02478/home/8452209 MLS PIN 61669441.0 42.4033462 -71.1745363
Single Family Residential 721 Pleasant St Belmont MA 2478 1480000 7.0 5.0 Belmont Center 6000.0 43900.0 1837.0 78.0 http://www.redfin.com/MA/Belmont/721-Pleasant-St-02478/home/11774971 MLS PIN 61594748.0 42.39572 -71.179743
Single Family Residential 12 Thoreau Rd Lexington MA 2420 1403000 4.0 2.0 Burnham Farms Estates 2263.0 38712.0 1959.0 81.0 http://www.redfin.com/MA/Lexington/12-Thoreau-Rd-02420/home/8577537 MLS PIN 61594796.0 42.463883 -71.211304
Single Family Residential 22 Leighton Rd Wellesley MA 2482 1530000 5.0 3.5 Dana Hall 3574.0 10000.0 1916.0 78.0 http://www.redfin.com/MA/Wellesley/22-Leighton-Rd-02482/home/8986596 MLS PIN 59747229.0 42.290604 -71.295376
Single Family Residential 123 Abbott Rd Wellesley MA 2481 2184000 6.0 5.5 Country Club 6119.0 34050.0 1905.0 78.0 http://www.redfin.com/MA/Wellesley/123-Abbott-Rd-02481/home/11724812 MLS PIN 56084352.0 42.303269 -71.267922
Condo/Co-op 41 Milford St #2 Boston MA 2118 2175000 3.0 2.5 South End 2150.0 2150.0 1890.0 84.0 http://www.redfin.com/MA/Boston/41-Milford-St-02118/unit-2/home/28465481 MLS PIN 61799764.0 42.344312 -71.069765
Condo/Co-op 250 Boylston St Unit 5 Boston MA 2116 11500000 4.0 4.5 None 4841.0 nan 1900.0 nan http://www.redfin.com/MA/Boston/250-Boylston-St-02116/unit-5/home/9313437 None nan 42.351883 -71.0693583
Condo/Co-op 65 E India Row Apt 17A Boston MA 2110 2050000 1.0 1.0 None 754.0 nan 1972.0 nan http://www.redfin.com/MA/Boston/65-E-India-Row-02110/unit-17A/home/9260829 None nan 42.3576699 -71.0504859
Condo/Co-op 110 Stuart St Unit 25J Boston MA 2116 1110000 1.0 1.5 Midtown 1103.0 nan 2009.0 88.0 http://www.redfin.com/MA/Boston/110-Stuart-St-02116/unit-25J/home/39913276 MLS PIN 60206455.0 42.3509321 -71.0653685
Condo/Co-op 165 Tremont St Unit 1502 Boston MA 2111 2365000 2.0 2.0 None 1054.0 nan 2003.0 nan http://www.redfin.com/MA/Boston/165-Tremont-St-02111/unit-1502/home/9329174 None nan 42.3539223 -71.0636921
Condo/Co-op 2 Avery St Unit 18E Boston MA 2111 3055000 3.0 3.5 Midtown 2344.0 nan 2000.0 87.0 http://www.redfin.com/MA/Boston/2-Avery-St-02111/unit-18E/home/11736816 MLS PIN 54707422.0 42.3527843 -71.0630991
Single Family Residential 142 Farm St Dover MA 2030 1125000 4.0 2.5 None 3273.0 89734.0 1859.0 nan http://www.redfin.com/MA/Dover/142-Farm-St-02030/home/11708174 None nan 42.221839 -71.321119
Single Family Residential 294 Central Ave Needham MA 2494 1049900 4.0 2.5 Needham 2938.0 10019.0 2011.0 87.0 http://www.redfin.com/MA/Needham/294-Central-Ave-02494/home/8923881 MLS PIN 61048054.0 42.306322 -71.237639
Townhouse 113 Huron Ave #2 Cambridge MA 2138 1320000 2.0 2.5 None 2095.0 nan 1900.0 nan http://www.redfin.com/MA/Cambridge/113-Huron-Ave-02138/unit-2/home/11588979 None nan 42.3834761 -71.1300727
Single Family Residential 360 K St Boston MA 2127 1225000 5.0 4.0 South Boston 4544.0 1425.0 1890.0 86.0 http://www.redfin.com/MA/Boston/360-K-St-02127/home/9193302 MLS PIN 59731322.0 42.3315355 -71.037093
Single Family Residential 206 Cliff Wellesley MA 2481 2550000 6.0 6.0 Cliff Estates 5950.0 20037.0 1997.0 84.0 http://www.redfin.com/MA/Wellesley/206-Cliff-Rd-02481/home/8981978 MLS PIN 36310769.0 42.321743 -71.290464
Single Family Residential 19 Morrison Rd Burlington MA 1803 1150000 4.0 2.5 None 4502.0 20000.0 2006.0 nan http://www.redfin.com/MA/Burlington/19-Morrison-Rd-01803/home/8444775 None nan 42.5173031 -71.2208988
Single Family Residential 6 Wendell St Winchester MA 1890 1340000 4.0 2.5 Winchester 3270.0 10400.0 2016.0 74.0 http://www.redfin.com/MA/Winchester/6-Wendell-St-01890/home/11453144 MLS PIN 60180977.0 42.464858 -71.137687
Single Family Residential 138 Bridge St Manchester MA 1944 1520000 4.0 3.0 Manchester 3518.0 55019.0 1900.0 87.0 http://www.redfin.com/MA/Manchester-by-the-Sea/138-Bridge-St-01944/home/11334731 MLS PIN 60847220.0 42.568452 -70.787117
Single Family Residential 95 Chittenden Ln Unit 17 Lindsay Cohasset MA 2025 1212393 2.0 2.5 Cohasset 2480.0 nan 2014.0 50.0 http://www.redfin.com/MA/Cohasset/95-Chittenden-Ln-02025/unit-17LINDSAY/home/55358164 MLS PIN 27195446.0 42.2357241 -70.8193172
Townhouse 285 Nahanton St #285 Newton MA 2459 1350000 3.0 4.5 Newton Center 3854.0 nan 2015.0 57.0 http://www.redfin.com/MA/Newton/285-Nahanton-St-02459/unit-285/home/108302796 MLS PIN 60191934.0 42.2974082 -71.2000151
Single Family Residential 185 Paul Revere Rd Needham MA 2494 1087000 4.0 2.5 Tower Hill 3641.0 14810.0 1955.0 36.0 http://www.redfin.com/MA/Needham/185-Paul-Revere-Rd-02494/home/8915982 MLS PIN 62451151.0 42.292259 -71.2242433
Single Family Residential 63 Damon Needham MA 2494 1385000 5.0 3.5 Needham 4600.0 10019.0 2016.0 29.0 http://www.redfin.com/MA/Needham/63-Damon-Rd-02494/home/8932197 MLS PIN 61863832.0 42.290309 -71.242329
Single Family Residential 181 Fair Oaks Park Needham MA 2492 1750000 5.0 5.5 Needham 5300.0 17860.0 2015.0 15.0 http://www.redfin.com/MA/Needham/181-Fair-Oaks-Park-02492/home/88915658 MLS PIN 47388500.0 42.2783925 -71.2317863
Single Family Residential 15 Little Boot Ln Westwood MA 2090 1967500 6.0 6.0 Westwood 6697.0 51761.0 2005.0 21.0 http://www.redfin.com/MA/Westwood/15-Little-Boot-Ln-02090/home/8984230 MLS PIN 58910699.0 42.223465 -71.236437
Single Family Residential 287 Hillside St Milton MA 2186 1672500 8.0 5.5 Blue Hills 6950.0 109771.0 1912.0 17.0 http://www.redfin.com/MA/Milton/287-Hillside-St-02186/home/8936111 MLS PIN 62572474.0 42.224508 -71.081359
Single Family Residential 7 Centre St Dover MA 2030 1100000 4.0 4.5 Dover 4596.0 43560.0 1986.0 38.0 http://www.redfin.com/MA/Dover/7-Centre-St-02030/home/11706795 MLS PIN 61537334.0 42.256763 -71.272509
Single Family Residential 203 Dedham St Dover MA 2030 3166600 5.0 6.0 Dover 9981.0 138390.0 2008.0 30.0 http://www.redfin.com/MA/Dover/203-Dedham-St-02030/home/18978390 MLS PIN 51699718.0 42.251977 -71.256074
Single Family Residential 15 Old Farm Rd Dover MA 2030 1533000 4.0 3.5 Dover 4450.0 43985.0 2004.0 14.0 http://www.redfin.com/MA/Dover/15-Old-Farm-Rd-02030/home/11708009 MLS PIN 62467843.0 42.2415779 -71.272977
Single Family Residential 125 Grove St Wellesley MA 2482 1335000 4.0 3.5 Dana Hall 2556.0 15184.0 1952.0 23.0 http://www.redfin.com/MA/Wellesley/125-Grove-St-02482/home/8983757 MLS PIN 62450180.0 42.29025 -71.290277
Single Family Residential 170 Oxbow Rd Needham MA 2492 1600000 6.0 9.5 Needham 9381.0 62291.0 2006.0 42.0 http://www.redfin.com/MA/Needham/170-Oxbow-Rd-02492/home/12404250 MLS PIN 58292520.0 42.263322 -71.270861
Single Family Residential 158 Winding River Rd Wellesley MA 2482 2100000 4.0 3.5 Wellesley 4276.0 62291.0 1968.0 7.0 http://www.redfin.com/MA/Wellesley/158-Winding-River-Rd-02482/home/8987120 MLS PIN 62840695.0 42.2736 -71.294696
Single Family Residential 36 Walnut Hill Ln Cohasset MA 2025 1046797 4.0 4.5 Estates at Cohasset 3788.0 11064.0 2016.0 59.0 http://www.redfin.com/MA/Cohasset/36-Walnut-Hill-Ln-02025/home/76487342 MLS PIN 51061653.0 42.2276623 -70.8033216
Condo/Co-op 448 Beacon #3 Boston MA 2115 3150000 2.0 2.5 Back Bay 2351.0 nan 2016.0 39.0 http://www.redfin.com/MA/Boston/448-Beacon-St-02115/unit-3/home/58643785 MLS PIN 55834875.0 42.351749 -71.087326
Townhouse 2-14 Saint Paul St #4 Brookline MA 2446 1050000 2.0 2.5 Brookline Village 1636.0 nan 2003.0 21.0 http://www.redfin.com/MA/Brookline/2-Saint-Paul-St-02446/unit-4/home/56374095 MLS PIN 63077336.0 42.3371121 -71.1191794
Single Family Residential 45 Rivard Rd Needham MA 2492 1269000 4.0 4.0 Needham 3566.0 10019.0 1968.0 36.0 http://www.redfin.com/MA/Needham/45-Rivard-Rd-02492/home/8937183 MLS PIN 61836076.0 42.279984 -71.246874
Townhouse 403-405 Parker St Newton MA 2459 1299900 4.0 4.5 Newton Center 4421.0 17048.0 2016.0 23.0 http://www.redfin.com/MA/Newton/403-405-Parker-St-02459/Frnt/home/109018446 MLS PIN 61826458.0 42.3028239 -71.1864397
Townhouse 238 Tappan St #238 Brookline MA 2445 1650000 5.0 3.5 Brookline 2983.0 nan 1920.0 37.0 http://www.redfin.com/MA/Brookline/238-Tappan-St-02445/unit-238/home/109016355 MLS PIN 61816946.0 42.3343048 -71.1358626
Single Family Residential 16 Merrill Rd Newton MA 2467 1887000 8.0 5.5 None 5805.0 17170.0 1916.0 nan http://www.redfin.com/MA/Newton/16-Merrill-Rd-02467/home/11458536 None nan 42.3375804 -71.178873
Condo/Co-op 1 Franklin St #3802 Boston MA 2110 2875000 2.0 2.0 Midtown 1486.0 nan 2016.0 57.0 http://www.redfin.com/MA/Boston/1-Franklin-ST-02108/unit-3802/home/63128678 MLS PIN 36295606.0 42.35631 -71.05945
Condo/Co-op 17 Ridgeway Ln #17 Boston MA 2114 1685000 2.0 2.0 Beacon Hill 1603.0 1603.0 1850.0 43.0 http://www.redfin.com/MA/Boston/17-Ridgeway-Ln-02114/unit-17/home/109599394 MLS PIN 62336260.0 42.3606561 -71.064064
Condo/Co-op 9 Woods Pl Unit TH Boston MA 2129 1175000 3.0 3.0 Charlestown 1795.0 1173.0 2013.0 30.0 http://www.redfin.com/MA/Boston/9-Woods-Pl-02129/unit-TH/home/108578742 MLS PIN 61344386.0 42.3817625 -71.0666599
Condo/Co-op 38 S Russell St Unit 1A Boston MA 2114 1385000 2.0 2.5 Beacon Hill 1460.0 nan 1906.0 23.0 http://www.redfin.com/MA/Boston/38-S-Russell-St-02114/unit-1A/home/108563449 MLS PIN 61282387.0 42.3600774 -71.0659607
Condo/Co-op 893 Broadway #2 Somerville MA 2144 1047500 4.0 3.5 Somerville 2250.0 nan 1906.0 29.0 http://www.redfin.com/MA/Somerville/893-Broadway-02144/unit-2/home/104213405 MLS PIN 61212068.0 42.4009735 -71.1186959
Condo/Co-op 92 Blossomcrest Rd #92 Lexington MA 2421 1180000 5.0 3.5 Lexington 4321.0 11088.0 2016.0 29.0 http://www.redfin.com/MA/Lexington/92-Blossom-Crest-Rd-02421/unit-92/home/105669724 MLS PIN 62352483.0 42.42271 -71.2258559
Single Family Residential 61 Pheasant Landing Rd Needham MA 2492 1320000 5.0 4.5 Needham 5000.0 44431.0 1988.0 15.0 http://www.redfin.com/MA/Needham/61-Pheasant-Landing-Rd-02492/home/8943739 MLS PIN 58878208.0 42.272819 -71.282943
Single Family Residential 7 Gannett Rd Scituate MA 2066 1320000 4.0 3.5 Minot 2750.0 7749.0 1930.0 37.0 http://www.redfin.com/MA/Scituate/7-Gannett-Rd-02066/home/16440288 MLS PIN 59793861.0 42.23432 -70.759825
Single Family Residential 47 Holbrook St Boston MA 2130 1250000 4.0 2.5 Jamaica Plain Pondside 2257.0 4340.0 1905.0 45.0 http://www.redfin.com/MA/Boston/47-Holbrook-St-02130/home/9147705 MLS PIN 62413863.0 42.31126 -71.11781
Single Family Residential 90 PLEASANT STREET (SOUTH) Natick MA 1760 1262500 5.0 5.0 South Natick 5809.0 42689.0 2006.0 44.0 http://www.redfin.com/MA/Natick/90-Pleasant-St-S-01760/home/11673044 MLS PIN 61350064.0 42.262993 -71.307029
Single Family Residential 90 Jason St Arlington MA 2476 1635500 5.0 3.5 Jason Heights 4234.0 9600.0 1910.0 44.0 http://www.redfin.com/MA/Arlington/90-Jason-St-02476/home/8445908 MLS PIN 62529949.0 42.4117495 -71.1614552
Single Family Residential 25 Wauwinet Rd Newton MA 2465 1117500 4.0 2.5 West Newton 2341.0 12385.0 1935.0 45.0 http://www.redfin.com/MA/Newton/25-Wauwinet-Rd-02465/home/11429634 MLS PIN 62347610.0 42.3391264 -71.2173824
Single Family Residential 23 Tyng St Newburyport MA 1950 1050000 4.0 3.5 Newburyport 3854.0 10000.0 1855.0 53.0 http://www.redfin.com/MA/Newburyport/23-Tyng-St-01950/home/8336604 MLS PIN 62585686.0 42.816517 -70.883562
Single Family Residential 12 Charlemont St Newton MA 2461 1250000 4.0 3.5 Newton 3625.0 8251.0 2016.0 28.0 http://www.redfin.com/MA/Newton/12-Charlemont-St-02461/home/108305589 MLS PIN 60201485.0 42.3070459 -71.2066999
Townhouse 21 Crenshaw Ln Unit 15-1 Andover MA 1810 1350000 2.0 2.5 Andover 3300.0 nan 2016.0 23.0 http://www.redfin.com/MA/Andover/21-Crenshaw-Ln-01810/unit-15-1/home/110162354 MLS PIN 63026919.0 42.663395 -71.167584
Single Family Residential 5 Wallingford Rd Marblehead MA 1945 1582050 6.0 4.5 Marblehead Neck 4251.0 17249.0 1890.0 44.0 http://www.redfin.com/MA/Marblehead/5-Wallingford-Rd-01945/home/11772599 MLS PIN 61784413.0 42.496315 -70.8412479
Condo/Co-op 67 Clark Ave Chelsea MA 2150 1050000 22.0 nan None 8847.0 nan 1910.0 nan http://www.redfin.com/MA/Chelsea/67-Clark-Ave-02150/home/9014071 None nan 42.397579 -71.026754
Townhouse 235 Walnut St #235 Brookline MA 2445 1875000 4.0 4.5 Brookline 3400.0 nan 2015.0 29.0 http://www.redfin.com/MA/Brookline/235-Walnut-St-02445/unit-235/home/110158950 MLS PIN 60612099.0 42.3294662 -71.1233776
Townhouse 18 Decatur St #18 Cambridge MA 2139 1200000 3.0 2.5 Cambridgeport 1484.0 nan 2004.0 42.0 http://www.redfin.com/MA/Cambridge/18-Decatur-St-02139/unit-18/home/110298058 MLS PIN 62840501.0 42.3608678 -71.1053265
Townhouse 144 Pleasant St #144 Brookline MA 2446 1349000 4.0 2.5 Coolidge Corner 2200.0 4624.0 2016.0 43.0 http://www.redfin.com/MA/Brookline/144-Pleasant-St-02446/unit-144/home/108637263 MLS PIN 61476869.0 42.347628 -71.1183049
Single Family Residential 17 Unity Ln Sherborn MA 1770 1350000 5.0 4.0 Sherborn 4477.0 531258.0 1995.0 43.0 http://www.redfin.com/MA/Sherborn/17-Unity-Ln-01770/home/8678732 MLS PIN 56181861.0 42.245268 -71.373125
Single Family Residential 85 Schoolmaster Ln Dedham MA 2026 1600000 5.0 5.5 Dedham 6617.0 463477.0 1993.0 49.0 http://www.redfin.com/MA/Dedham/85-Schoolmasters-Ln-02026/home/8879241 MLS PIN 56089745.0 42.255807 -71.214156
Single Family Residential 29 LIVINGSTON Cir Needham MA 2492 1290000 5.0 3.5 Needham 4618.0 10019.0 2002.0 50.0 http://www.redfin.com/MA/Needham/29-Livingston-Cir-02492/home/8899108 MLS PIN 61813094.0 42.270104 -71.215971
Single Family Residential 18 Holland St Needham MA 2492 1557500 4.0 2.5 None 3816.0 10454.0 2005.0 nan http://www.redfin.com/MA/Needham/18-Holland-St-02492/home/8914194 None nan 42.2851454 -71.2289256
Single Family Residential 30 Benvenue St Wellesley MA 2482 2530000 6.0 7.0 Dana Hall 6514.0 26456.0 2014.0 39.0 http://www.redfin.com/MA/Wellesley/30-Benvenue-St-02482/home/40450634 MLS PIN 59111302.0 42.286924 -71.290032
Single Family Residential 28 Cross St Dover MA 2030 1000000 5.0 4.0 None 4883.0 43940.0 1988.0 nan http://www.redfin.com/MA/Dover/28-Cross-St-02030/home/11706849 None nan 42.251862 -71.268592
Single Family Residential 73 BAYLEY St Westwood MA 2090 1125000 4.0 2.5 High School 3000.0 12196.0 2016.0 43.0 http://www.redfin.com/MA/Westwood/73-Bayley-St-02090/home/8971941 MLS PIN 62161236.0 42.214221 -71.2199119
Townhouse 285 Nahanton St Newton MA 2459 1350000 3.0 3.5 None 2616.0 1254089.0 1988.0 nan http://www.redfin.com/MA/Newton/285-Nahanton-St-02459/home/11490092 None nan 42.2973699 -71.2000765
Single Family Residential 42 Amherst Rd Wellesley MA 2482 1590000 4.0 3.5 Wellesley 2655.0 11174.0 1932.0 58.0 http://www.redfin.com/MA/Wellesley/42-Amherst-Rd-02482/home/8980804 MLS PIN 62452658.0 42.2967153 -71.2856102
Single Family Residential 38 Sterling Rd Wellesley MA 2482 1300000 4.0 3.5 Wellesley 2616.0 15032.0 1938.0 52.0 http://www.redfin.com/MA/Wellesley/38-Sterling-Rd-02482/home/11725244 MLS PIN 62068889.0 42.292352 -71.283118
Townhouse 5 Wellington St #1 Boston MA 2118 1663000 2.0 2.5 South End 1541.0 1541.0 1910.0 57.0 http://www.redfin.com/MA/Boston/5-Wellington-St-02118/unit-1/home/9283645 MLS PIN 61953379.0 42.3415652 -71.0814819
Vacant Land 3 Possum Hollow Rd Andover MA 1810 1138000 nan nan None nan 55191.0 nan nan http://www.redfin.com/MA/Andover/3-Possum-Hollow-Rd-01810/home/76492720 None nan 42.677729 -71.207592
Single Family Residential 2 Wellington Way Bedford MA 1730 1150000 4.0 2.5 None 2944.0 43560.0 2012.0 nan http://www.redfin.com/MA/Bedford/2-Wellington-Way-01730/home/109826855 None nan 42.5029123 -71.2849657
Single Family Residential 140 Woodside Rd Sudbury MA 1776 1325000 4.0 3.5 Neighborhood near Hopestill Brown 4400.0 40182.0 2015.0 52.0 http://www.redfin.com/MA/Sudbury/140-Woodside-Rd-01776/home/11683285 MLS PIN 60173055.0 42.346602 -71.412651
Single Family Residential 46 Ferncroft Rd Newton MA 2468 1225000 5.0 3.0 Waban 3000.0 9149.0 1940.0 53.0 http://www.redfin.com/MA/Newton/46-Ferncroft-Rd-02468/home/11450747 MLS PIN 61122691.0 42.3319084 -71.2208592
Single Family Residential 10 Reservoir Rd Wayland MA 1778 1025500 5.0 3.5 Wayland 3950.0 63049.0 1958.0 51.0 http://www.redfin.com/MA/Wayland/10-Reservoir-Rd-01778/home/11690009 MLS PIN 62564502.0 42.336269 -71.344055
Single Family Residential 18 Walnut St Lexington MA 2421 1899000 5.0 5.0 Lexington 6074.0 17424.0 2015.0 57.0 http://www.redfin.com/MA/Lexington/18-Walnut-St-02421/home/8541326 MLS PIN 62347793.0 42.414408 -71.219675
Single Family Residential 191 Oakland Ave Arlington MA 2476 1025900 4.0 3.0 Arlington Heights 2207.0 5000.0 1925.0 49.0 http://www.redfin.com/MA/Arlington/191-Oakland-Ave-02476/home/8456552 MLS PIN 62449525.0 42.4166214 -71.1839804
Single Family Residential 653 Lowell St Lexington MA 2420 1230000 4.0 3.0 Lexington 3328.0 30000.0 2014.0 72.0 http://www.redfin.com/MA/Lexington/655-Lowell-St-02420/home/8580353 MLS PIN 61459479.0 42.4665127 -71.2083928
Single Family Residential 85 Blake Rd Lexington MA 2420 1075000 4.0 2.5 Lexington 2540.0 9200.0 1948.0 59.0 http://www.redfin.com/MA/Lexington/85-Blake-Rd-02420/home/8581460 MLS PIN 61943907.0 42.46773 -71.23467
Single Family Residential 8 Ellison Rd Lexington MA 2421 1540000 6.0 5.5 Hastings School Neighborhood 6729.0 10281.0 2013.0 49.0 http://www.redfin.com/MA/Lexington/8-Ellison-Rd-02421/home/8562034 MLS PIN 59015573.0 42.442559 -71.252846
Single Family Residential 12 Manchester Rd Winchester MA 1890 1009000 5.0 2.0 Winchester 3109.0 12834.0 1910.0 43.0 http://www.redfin.com/MA/Winchester/12-Manchester-Rd-01890/home/11448542 MLS PIN 62030397.0 42.448237 -71.135919
Single Family Residential 36 Irving St #5 Cambridge MA 2138 1037000 2.0 2.0 Harvard Square 1302.0 1807.0 1978.0 49.0 http://www.redfin.com/MA/Cambridge/36-Irving-St-02138/unit-5/home/11572850 MLS PIN 62661761.0 42.3763419 -71.1105207
Townhouse 162 Appleton St Cambridge MA 2138 1550000 3.0 3.5 None 2434.0 nan 1886.0 nan http://www.redfin.com/MA/Cambridge/162-Appleton-St-02138/home/39915400 None nan 42.3819163 -71.1337688
Condo/Co-op 1 Monument Sq Boston MA 2129 2350000 3.0 2.5 None 2415.0 nan 1899.0 nan http://www.redfin.com/MA/Boston/1-Monument-Sq-02129/home/9323443 None nan 42.375331 -71.060148
Condo/Co-op 221 Mount Auburn St #608 Cambridge MA 2138 1350000 2.0 2.0 Harvard Square 1191.0 nan 1960.0 49.0 http://www.redfin.com/MA/Cambridge/221-Mount-Auburn-St-02138/unit-608/home/11587560 MLS PIN 62340767.0 42.3749516 -71.129983
Single Family Residential 20 Laxfield Rd Weston MA 2493 1750000 5.0 6.0 Weston 9468.0 74487.0 2015.0 43.0 http://www.redfin.com/MA/Weston/20-Laxfield-Rd-02493/home/8782460 MLS PIN 62611696.0 42.373844 -71.306444
Single Family Residential 26 Fletcher Ave Lexington MA 2420 1450000 4.0 2.5 Lexington Center 3544.0 10018.0 2003.0 49.0 http://www.redfin.com/MA/Lexington/26-Fletcher-Ave-02420/home/8565173 MLS PIN 62763648.0 42.448571 -71.2218033
Single Family Residential 36 Vernon St Brookline MA 2446 1667000 6.0 3.0 None 3894.0 5560.0 1896.0 nan http://www.redfin.com/MA/Brookline/36-Vernon-St-02446/home/11467450 None nan 42.3388417 -71.1230621
Condo/Co-op 1 Avery St Unit 18D Boston MA 2111 1450000 2.0 2.0 Midtown 1135.0 1135.0 2000.0 50.0 http://www.redfin.com/MA/Boston/1-Avery-St-02111/unit-18D/home/9312500 MLS PIN 62314937.0 42.3532598 -71.0625369
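
The field types listed at the top of this output are inferred from the CSV values. The following is a minimal sketch of how such inference can work, using only the Python standard library. It is an illustration under simplifying assumptions, not the code that PixieDust or Spark runs; the helper names infer_type and infer_schema and the three-row sample are hypothetical.

```python
import csv
import io

# Hypothetical three-row sample shaped like a few columns of the homes data
SAMPLE = """CITY,ZIP,PRICE,BEDS
Brookline,02446,1400000,3.0
Bedford,01730,1150000,4.0
Andover,01810,1138000,
"""


def infer_type(values):
    """Return 'int', 'float', or 'str' for a column of raw strings."""
    present = [v for v in values if v != ""]  # skip missing values
    for name, cast in (("int", int), ("float", float)):
        try:
            for v in present:
                cast(v)
            return name
        except ValueError:
            continue
    return "str"


def infer_schema(text):
    """Map each CSV column name to its inferred type."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    return {
        col: infer_type([row[col] for row in rows])
        for col in reader.fieldnames
    }


schema = infer_schema(SAMPLE)
print(schema)
```

In this sketch, ZIP is typed as an integer, and an integer cannot keep a leading zero (int("02446") is 2446). That would be consistent with the preview above, where ZIP shows 2446 next to a listing URL containing 02446.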

Simple visualization using bar charts

With PixieDust display(), you can visually explore the loaded data using built-in charts, such as bar charts, line charts, scatter plots, or maps.

To explore a data set:

  • choose the desired chart type from the drop-down menu
  • configure chart options
  • configure display options

You can analyze the average home price for each city by choosing:

  • chart type: bar chart
  • chart options
    • Options > Keys: CITY
    • Options > Values: PRICE
    • Options > Aggregation: AVG

Run the next cell to review the results.

In [10]:
display(homes)
[Chart output: Average home price by city]
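
As a rough picture of what the Keys, Values, and AVG settings compute, the sketch below groups prices by city and averages each group in plain Python. It uses five (CITY, PRICE) pairs copied from the table preview above, so its numbers cover only those rows, not the full data set. The helper name average_by_key is hypothetical, and this is not how PixieDust itself performs the aggregation.

```python
from collections import defaultdict

# A few (CITY, PRICE) pairs copied from the table preview above
listings = [
    ("Dover", 1125000),
    ("Dover", 1100000),
    ("Bedford", 1150000),
    ("Belmont", 1560000),
    ("Belmont", 1480000),
]


def average_by_key(pairs):
    """Group values by key and return the mean of each group."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sum(values) / len(values) for key, values in groups.items()}


avg_price = average_by_key(listings)
for city, price in sorted(avg_price.items()):
    print(city, price)
```

On a Spark DataFrame, a comparable aggregation over all rows can be written as homes.groupBy("CITY").avg("PRICE").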

Explore the data

Change the display options to keep exploring the loaded data set without having to pre-process the data.

For example, change:

  • Options > Keys to YEAR BUILT
  • Options > Aggregation to COUNT

Now you can find out how old the listed properties are:

In [11]:
display(homes)
[Chart output: Property age]
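
The COUNT aggregation can be pictured the same way. The sketch below counts how many listings share each YEAR BUILT value, using the first eight preview rows plus the Vacant Land row, whose year is missing. Skipping missing values is an assumption of this sketch, not a documented PixieDust behavior, and count_by_key is a hypothetical name.

```python
import math
from collections import Counter

# YEAR BUILT from the first eight preview rows, plus the Vacant Land
# row, where the value is missing (nan)
year_built = [
    1915.0, 2012.0, 2016.0, 2014.0,
    1910.0, 2000.0, 2012.0, 2013.0,
    float("nan"),
]


def count_by_key(values):
    """Count occurrences of each value, skipping missing (NaN) entries."""
    return Counter(int(v) for v in values if not math.isnan(v))


counts = count_by_key(year_built)
for year, n in sorted(counts.items()):
    print(year, n)
```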

Use sample data sets

PixieDust comes with a set of curated data sets that you can use to get familiar with the different chart types and options.

Type pixiedust.sampleData() to display those data sets.

In [12]:
pixiedust.sampleData()
Id | Name | Topic | Publisher
1 | Car performance data | Transportation | IBM
2 | Sample retail sales transactions, January 2009 | Economy & Business | IBM Cloud Data Services
3 | Total population by country | Society | IBM Cloud Data Services
4 | GoSales Transactions for Naive Bayes Model | Leisure | IBM
5 | Election results by County | Society | IBM
6 | Million dollar home sales in Massachusetts, USA Feb 2017 through Jan 2018 | Economy & Business | Redfin.com
7 | Boston Crime data, 2-week sample | Society | City of Boston

The home sales data set you loaded earlier is one of the samples. Therefore, you could have loaded it by specifying the displayed data set id as the parameter: homes = pixiedust.sampleData(6)

If your data isn't stored in CSV files, you can load it into a DataFrame from any supported Spark data source. See these Python code snippets for more information.

End of chapter. Return to table of contents


Mix Scala and Python on the same notebook

Python has a rich ecosystem of modules, including matplotlib for plotting, pandas for data structures and analysis, and libraries for machine learning and natural language processing. However, data scientists working with Spark might occasionally need to call code written in Scala or Java, for example, one of the hundreds of libraries available on spark-packages.org. Unfortunately, Jupyter Python notebooks do not currently provide a way to call Scala or Java code. As a result, a typical workaround is to first use a Scala notebook to run the Scala code, persist the output somewhere like the Hadoop Distributed File System, create another Python notebook, and reload the data. This is inefficient and awkward.

As you'll see in this notebook, PixieDust solves this problem by letting users write and run Scala code directly in its own cell. It also lets variables be shared between Python and Scala in both directions.

Define a few simple variables in Python

In [13]:
pythonString = "Hello From Python"
pythonInt = 20

Import the PixieDust module

If you haven't already, import PixieDust. Follow the instructions in Get started.

Use the Python variables in Scala code

PixieDust makes all variables defined in the Python scope available to Scala using the following rules:

  • Primitive types are mapped to the Scala equivalent: for example, Python strings become Scala strings, Python integers become Scala integers, and so on.
  • Some complex types are mapped as follows: PySpark SQLContext, DataFrame, and RDD are mapped to their Scala Spark equivalents. Python GraphFrames are mapped to their Scala equivalents. More mappings will be added to PixieDust as needed.
  • Python classes are currently not converted and therefore cannot be used in Scala.
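The rules above can be pictured with a small pure-Python sketch. This is not PixieDust's implementation: the helper classify_for_scala, the exact set of "primitive" types, and the class Point are all assumptions made for illustration. The sketch only sorts the names in a namespace into those the rules would share with Scala and those they would skip.

```python
# Hypothetical illustration of the mapping rules; not PixieDust code.
# Assumption: "primitive" means these four built-in types.
PRIMITIVE_TYPES = (str, int, float, bool)
# Class names of the complex types the rules say are mapped.
MAPPED_CLASS_NAMES = {"SQLContext", "DataFrame", "RDD", "GraphFrame"}

def classify_for_scala(namespace):
    """Split variable names into (shared, skipped) according to the rules."""
    shared, skipped = [], []
    for name, value in sorted(namespace.items()):
        if isinstance(value, PRIMITIVE_TYPES) or type(value).__name__ in MAPPED_CLASS_NAMES:
            shared.append(name)
        else:
            skipped.append(name)
    return shared, skipped

class Point(object):
    """A user-defined class, which the rules say is not converted."""
    pass

shared, skipped = classify_for_scala(
    {"pythonString": "Hello From Python", "pythonInt": 20, "p": Point()}
)
print(shared)   # ['pythonInt', 'pythonString']
print(skipped)  # ['p']
```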

The PixieDust Scala Bridge requires the environment variable SCALA_HOME to be defined and to point at a Scala installation. With that in place, a %%scala cell can read the Python variables defined above:

In [14]:
%%scala
print(pythonString)
print(pythonInt + 10)
Hello From Python
30
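If a %%scala cell fails instead of printing output like the above, one thing worth checking is the SCALA_HOME requirement. The following is a minimal sketch using only the Python standard library. The helper name check_scala_home is made up for this example, and the check only confirms that the variable is set and names an existing directory, not that the directory holds a working Scala installation.

```python
import os

def check_scala_home(environ=None):
    """Return (ok, message) describing whether SCALA_HOME looks usable."""
    # Hypothetical helper: accepts a dict so it can be tried without
    # touching the real environment.
    environ = os.environ if environ is None else environ
    scala_home = environ.get("SCALA_HOME")
    if not scala_home:
        return False, "SCALA_HOME is not set"
    if not os.path.isdir(scala_home):
        return False, "SCALA_HOME points at a missing directory: " + scala_home
    return True, "SCALA_HOME = " + scala_home

ok, message = check_scala_home()
print(message)
```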

Define a variable in Scala and use it in Python

In this section, you'll create a variable in Scala and use it in Python.

Note: only variables that are prefixed with two underscores ( __ ) are available for use in Python.

In [15]:
%%scala
val __scalaString = "Hello From Scala"
val __scalaInt = 5
In [16]:
# using Scala variable in Python
print(__scalaString)
print(__scalaInt + 10)
Hello From Scala
15
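The naming rule can be summarized in a few lines of plain Python. This sketch is only an illustration of the rule stated in the note: the helper visible_in_python and the name tmpCounter are hypothetical, and the code does not call PixieDust.

```python
def visible_in_python(scala_names):
    """Keep only the Scala variable names that start with two underscores."""
    return [name for name in scala_names if name.startswith("__")]

# tmpCounter stands in for a Scala variable declared without the prefix.
print(visible_in_python(["__scalaString", "__scalaInt", "tmpCounter"]))
# ['__scalaString', '__scalaInt']
```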

In this chapter, you've seen how easy it is to intersperse Scala and Python in the same notebook. Continue exploring this powerful functionality by using more complex Scala libraries!

End of chapter. Return to table of contents


Add Spark packages and run inside your notebook

PixieDust Package Manager helps you install Spark packages inside your notebook. This is especially useful when you're working in a hosted cloud environment without access to configuration files. Use PixieDust Package Manager to install:

  • a Spark package from spark-packages.org
  • a package from the Maven search repository
  • a jar file directly from URL

Note: After you install a package, you must restart the kernel and import PixieDust again.

View list of packages

To see the packages installed on your system, run the following command:

In [17]:
import pixiedust
pixiedust.printAllPackages()
graphframes:graphframes:0.5.0-spark2.1-s_2.11 => /gpfs/fs01/user/sf9b-795b2b888c32b6-772f4e1cd93d/data/libs/graphframes-0.5.0-spark2.1-s_2.11.jar
com.typesafe.scala-logging:scala-logging-api_2.11:2.1.2 => /gpfs/fs01/user/sf9b-795b2b888c32b6-772f4e1cd93d/data/libs/scala-logging-api_2.11-2.1.2.jar
com.typesafe.scala-logging:scala-logging-slf4j_2.11:2.1.2 => /gpfs/fs01/user/sf9b-795b2b888c32b6-772f4e1cd93d/data/libs/scala-logging-slf4j_2.11-2.1.2.jar
direct.download:https://github.com/ibm-watson-data-lab/spark.samples/raw/master/dist/streaming-twitter-assembly-1.6.jar:1.0 => /gpfs/fs01/user/sf9b-795b2b888c32b6-772f4e1cd93d/data/libs/streaming-twitter-assembly-1.6.jar

Add a package from spark-packages.org

The command you use to install GraphFrames depends on your Spark version.

In [18]:
if sc.version.startswith('1.6.'):  # Spark 1.6
    pixiedust.installPackage("graphframes:graphframes:0.5.0-spark1.6-s_2.11")
elif sc.version.startswith('2.'):  # Spark 2.1, 2.0
    pixiedust.installPackage("graphframes:graphframes:0.5.0-spark2.1-s_2.11")


pixiedust.installPackage("com.typesafe.scala-logging:scala-logging-api_2.11:2.1.2")
pixiedust.installPackage("com.typesafe.scala-logging:scala-logging-slf4j_2.11:2.1.2")
Package already installed: graphframes:graphframes:0.5.0-spark2.1-s_2.11
Package already installed: com.typesafe.scala-logging:scala-logging-api_2.11:2.1.2
Package already installed: com.typesafe.scala-logging:scala-logging-slf4j_2.11:2.1.2
Out[18]:
<pixiedust.packageManager.package.Package at 0x7f83f5cdee50>

Note: After you install a package, you must restart the kernel and import PixieDust again. You'll also need to run pixiedust.installPackage again before that package can be used. To do this, rerun the two code cells above after you have restarted the kernel.
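Because the install cell has to be rerun after every kernel restart, it can help to separate the version check from the install call. The sketch below wraps the same two branches in a function that takes the version string (the value of sc.version) and returns the package identifier to pass to pixiedust.installPackage. The function name graphframes_coordinate is an assumption, and raising an error for other versions is a choice made here rather than something the notebook does.

```python
def graphframes_coordinate(spark_version):
    """Pick the GraphFrames package identifier for a Spark version string."""
    if spark_version.startswith("1.6."):
        return "graphframes:graphframes:0.5.0-spark1.6-s_2.11"
    if spark_version.startswith("2."):
        return "graphframes:graphframes:0.5.0-spark2.1-s_2.11"
    # The original cell silently skips other versions; failing loudly
    # makes an unsupported environment easier to notice.
    raise ValueError("No GraphFrames package listed for Spark " + spark_version)

print(graphframes_coordinate("2.1.0"))
# graphframes:graphframes:0.5.0-spark2.1-s_2.11
```

In the notebook, the result would be used as pixiedust.installPackage(graphframes_coordinate(sc.version)).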

View the updated list of packages

Run printAllPackages again to see that GraphFrames is now in your list:

In [19]:
pixiedust.printAllPackages()
graphframes:graphframes:0.5.0-spark2.1-s_2.11 => /gpfs/fs01/user/sf9b-795b2b888c32b6-772f4e1cd93d/data/libs/graphframes-0.5.0-spark2.1-s_2.11.jar
com.typesafe.scala-logging:scala-logging-api_2.11:2.1.2 => /gpfs/fs01/user/sf9b-795b2b888c32b6-772f4e1cd93d/data/libs/scala-logging-api_2.11-2.1.2.jar
com.typesafe.scala-logging:scala-logging-slf4j_2.11:2.1.2 => /gpfs/fs01/user/sf9b-795b2b888c32b6-772f4e1cd93d/data/libs/scala-logging-slf4j_2.11-2.1.2.jar
direct.download:https://github.com/ibm-watson-data-lab/spark.samples/raw/master/dist/streaming-twitter-assembly-1.6.jar:1.0 => /gpfs/fs01/user/sf9b-795b2b888c32b6-772f4e1cd93d/data/libs/streaming-twitter-assembly-1.6.jar

Display a GraphFrames data sample

Even if GraphFrames is already installed, running the install command loads the Python module that comes with the package. Run the following cell, and PixieDust displays a sample graph data set. At the upper left of the display, click the table dropdown to switch between views of nodes and edges.

In [21]:
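from graphframes import GraphFrame
from pyspark.sql import SQLContext

try:
    # Spark 2.x: create DataFrames through a SparkSession
    from pyspark.sql import SparkSession
    sqlcontext = SparkSession.builder.getOrCreate()
except ImportError:
    # Spark 1.6: SparkSession is not available, so fall back to SQLContext
    sqlcontext = SQLContext(sc)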

# Vertex DataFrame
v = sqlcontext.createDataFrame([
  ("a", "Alice", 34),
  ("b", "Bob", 36),
  ("c", "Charlie", 30),
  ("d", "David", 29),
  ("e", "Esther", 32),
  ("f", "Fanny", 36),
  ("g", "Gabby", 60)
], ["id", "name", "age"])

# Edge DataFrame
e = sqlcontext.createDataFrame([
  ("a", "b", "friend"),
  ("b", "c", "follow"),
  ("c", "b", "follow"),
  ("f", "c", "follow"),
  ("e", "f", "follow"),
  ("e", "d", "friend"),
  ("d", "a", "friend"),
  ("a", "e", "friend")
], ["src", "dst", "relationship"])

# Create a GraphFrame
g = GraphFrame(v, e)

display(g)
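The edge tuples in the cell above refer to vertices by their id. As a quick sanity check that needs neither Spark nor GraphFrames, the following sketch repeats the sample data as plain Python lists, looks for edges that point at an unknown id, and lists vertices that no edge touches. The variable names dangling and isolated are introduced here for illustration.

```python
# Same sample data as the GraphFrame cell, as plain Python tuples.
vertices = [
    ("a", "Alice", 34), ("b", "Bob", 36), ("c", "Charlie", 30),
    ("d", "David", 29), ("e", "Esther", 32), ("f", "Fanny", 36),
    ("g", "Gabby", 60),
]
edges = [
    ("a", "b", "friend"), ("b", "c", "follow"), ("c", "b", "follow"),
    ("f", "c", "follow"), ("e", "f", "follow"), ("e", "d", "friend"),
    ("d", "a", "friend"), ("a", "e", "friend"),
]

vertex_ids = {v[0] for v in vertices}

# Edges whose source or destination is not a known vertex id.
dangling = [(src, dst) for src, dst, _ in edges
            if src not in vertex_ids or dst not in vertex_ids]

# Vertices that appear in no edge at all.
connected = {src for src, _, _ in edges} | {dst for _, dst, _ in edges}
isolated = sorted(vertex_ids - connected)

print(dangling)  # []
print(isolated)  # ['g']
```

In the sample, every edge resolves to a vertex, and Gabby is the only vertex without any relationship.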

Install from Maven

To install a package from the Apache Maven search repository, visit the project and find the groupId and artifactId for the package that you want. Enter them in the following installation command. See instructions for the installPackage command. For example, the following cell installs Apache Commons CSV. In the output shown here, the version 0 given in the command is resolved to release 1.5:

In [22]:
pixiedust.installPackage("org.apache.commons:commons-csv:0")
Downloading package org.apache.commons:commons-csv:1.5 to /gpfs/fs01/user/sf9b-795b2b888c32b6-772f4e1cd93d/data/libs/commons-csv-1.5.jar
Starting download...
Package org.apache.commons:commons-csv:1.5 downloaded successfully
Please restart Kernel to complete installation of the new package
Successfully added package org.apache.commons:commons-csv:1.5
Out[22]:
<pixiedust.packageManager.package.Package at 0x7f85129c9310>
In [ ]:
# List the installed packages again to confirm that the new package appears
pixiedust.printAllPackages()

Install a jar file directly from a URL

To install a jar file that is not packaged in a Maven repository, provide its URL.

In [23]:
pixiedust.installPackage("https://github.com/ibm-watson-data-lab/spark.samples/raw/master/dist/streaming-twitter-assembly-1.6.jar")
Package already installed: https://github.com/ibm-watson-data-lab/spark.samples/raw/master/dist/streaming-twitter-assembly-1.6.jar
Out[23]:
<pixiedust.packageManager.package.Package at 0x7f83f5cd0290>
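The identifiers passed to installPackage in this chapter take two shapes: a groupId:artifactId:version string or a direct URL to a jar file. The following sketch shows one way to tell them apart before calling the install command. It is not part of PixieDust: the helper describe_package and the dictionary keys it returns are assumptions for illustration.

```python
def describe_package(identifier):
    """Classify a package identifier as a jar URL or a three-part coordinate."""
    if identifier.startswith(("http://", "https://")):
        return {"kind": "jar-url", "url": identifier}
    parts = identifier.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            "expected groupId:artifactId:version, got %r" % identifier
        )
    group_id, artifact_id, version = parts
    return {
        "kind": "coordinate",
        "groupId": group_id,
        "artifactId": artifact_id,
        "version": version,
    }

info = describe_package("org.apache.commons:commons-csv:0")
print(info["groupId"], info["artifactId"], info["version"])
# org.apache.commons commons-csv 0
```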

Follow the tutorial

To understand what you can do with this jar file, read David Taieb's latest Realtime Sentiment Analysis of Twitter Hashtags with Spark tutorial.

Uninstall a package

It's just as easy to remove a package you installed. Run the command pixiedust.uninstallPackage("<<mypackage>>"). For example, you can uninstall Apache Commons CSV:

In [24]:
pixiedust.uninstallPackage("org.apache.commons:commons-csv:0")
Successfully deleted package org.apache.commons:commons-csv:1.5

Restart the kernel and import pixiedust

After uninstalling a package, restart the kernel and import pixiedust before continuing.

In [1]:
# import pixiedust after restarting kernel
import pixiedust
Pixiedust database opened successfully
Pixiedust version 1.1.9

End of chapter. Return to table of contents


Stash Your Data

With PixieDust, you also have the option to export the data from your notebook to external sources. The output of the display API includes a toolbar that contains a Download button.

Stash to Cloudant

You can save the data directly into a Cloudant or CouchDB database.

Prerequisite: Collect your database connection information: the database host, user name, and password.

If your Cloudant instance was provisioned in IBM Cloud, you can find the connectivity information in the Service Credentials tab.

To stash to Cloudant:

  1. From the toolbar in the display output, click the Download button.
  2. Choose Stash to Cloudant from the menu.
  3. Click the dropdown to see the list of available connections and select an existing connection or add a new connection:
    1. Click the + (plus) button to add a new connection.
    2. Enter your Cloudant database credentials in JSON format.
    3. If you are stashing to CouchDB, include the protocol. See the sample credentials format below.
    4. Click OK.
    5. Select the new connection.
  4. Click Submit.

Sample Credentials Format

CouchDB

{
    "name": "local-couchdb-connection",
    "credentials": {
        "username": "couchdbuser",
        "password": "password",
        "protocol": "http",
        "host": "127.0.0.1:5984",
        "port": 5984,
        "url": "http://couchdbuser:password@127.0.0.1:5984"
    }
}

Cloudant

{
    "name": "remote-cloudant-connection",
    "credentials": {
        "username": "username-ibmcloud",
        "password": "password",
        "host": "host-ibmcloud.cloudant.com",
        "port": 443,
        "url": "https://username-ibmcloud:password@host-ibmcloud.cloudant.com"
    }
}
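Both samples follow the same pattern: the url field repeats the user name, password, and host. The sketch below builds a credentials document of that shape from its parts and serializes it with the standard json module. The helper make_connection is hypothetical, and leaving out the protocol key when none is given (and assuming https in that case) is inferred from the two samples rather than documented behavior.

```python
import json

def make_connection(name, username, password, host, port, protocol=None):
    """Build a credentials document shaped like the samples above."""
    # Assumption: when no protocol is given, the url uses https and the
    # "protocol" key is omitted, as in the Cloudant sample.
    scheme = protocol or "https"
    credentials = {
        "username": username,
        "password": password,
        "host": host,
        "port": port,
        "url": "{0}://{1}:{2}@{3}".format(scheme, username, password, host),
    }
    if protocol is not None:
        credentials["protocol"] = protocol
    return {"name": name, "credentials": credentials}

couch = make_connection(
    "local-couchdb-connection", "couchdbuser", "password",
    "127.0.0.1:5984", 5984, protocol="http",
)
print(json.dumps(couch, indent=4, sort_keys=True))
```

A user name or password that contains characters such as @ or : would need to be percent-encoded before being placed in the url.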

Download as a file

Alternatively, you can save the data set in one of several file formats, such as CSV, JSON, or XML.

To save a data set as a file:

  1. From the toolbar in the display output, click the Download button.
  2. Choose Download as File.
  3. Choose the desired format.
  4. Specify the number of records to download.
  5. Click OK.
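To make the format and record-count choices in these steps concrete, here is a rough stand-alone sketch (Python 3, standard library only) that writes the first few records of a small data set as CSV or JSON text. It is not how the Download button is implemented. The function export_records is made up for this example, and the data is the two-row color data set from the start of this notebook.

```python
import csv
import io
import json

def export_records(field_names, rows, fmt="csv", limit=None):
    """Serialize up to `limit` rows as CSV or JSON text."""
    selected = rows if limit is None else rows[:limit]
    if fmt == "json":
        return json.dumps([dict(zip(field_names, row)) for row in selected])
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(field_names)   # header line
        writer.writerows(selected)     # one line per record
        return buffer.getvalue()
    raise ValueError("unsupported format: " + fmt)

rows = [("Green", 75), ("Blue", 25)]
print(export_records(["Colors", "%"], rows, fmt="csv", limit=1))
```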

End of chapter. Return to table of contents


Contribute

By now, you've walked through PixieDust's intro notebooks and seen PixieDust in action. If you like what you saw, join the project!

Anyone can get involved. Here are some ways you can contribute:

Write a visualization

Contribute your own custom visualization. Here's a taste of how it works.

Run the next 4 cells to do the following:

  1. Import PixieDust.
  2. Generate a sample DataFrame.
  3. Create a custom table display option called NewSample.
  4. Display the DataFrame and see your new custom option under the Table dropdown menu.

This is just one small example you can quickly do within this notebook. Read how to create a custom visualization.

In [2]:
import pixiedust

Now, create a simple DataFrame:

In [3]:
sqlContext=SQLContext(sc)
d1 = spark.createDataFrame(
[(2010, 'Camping Equipment', 3),
 (2010, 'Golf Equipment', 1),
 (2010, 'Mountaineering Equipment', 1),
 (2010, 'Outdoor Protection', 2),
 (2010, 'Personal Accessories', 2),
 (2011, 'Camping Equipment', 4),
 (2011, 'Golf Equipment', 5),
 (2011, 'Mountaineering Equipment',2),
 (2011, 'Outdoor Protection', 4),
 (2011, 'Personal Accessories', 2),
 (2012, 'Camping Equipment', 5),
 (2012, 'Golf Equipment', 5),
 (2012, 'Mountaineering Equipment', 3),
 (2012, 'Outdoor Protection', 5),
 (2012, 'Personal Accessories', 3),
 (2013, 'Camping Equipment', 8),
 (2013, 'Golf Equipment', 5),
 (2013, 'Mountaineering Equipment', 3),
 (2013, 'Outdoor Protection', 8),
 (2013, 'Personal Accessories', 4)],
["year","zone","unique_customers"])

The following cell creates a new custom table visualization plugin called NewSample:

In [4]:
from pixiedust.display.display import *

class TestDisplay(Display):
    def doRender(self, handlerId):
        self._addHTMLTemplateString(
"""
NewSample Plugin
<table class="table table-striped">
    <thead>                 
        {%for field in entity.schema.fields%}
        <th>{{field.name}}</th>
        {%endfor%}
    </thead>
    <tbody>
        {%for row in entity.take(100)%}
        <tr>
            {%for field in entity.schema.fields%}
            <td>{{row[field.name]}}</td>
            {%endfor%}
        </tr>
        {%endfor%}
    </tbody>
</table>
"""
        )

@PixiedustDisplay()
class TestPluginMeta(DisplayHandlerMeta):
    @addId
    def getMenuInfo(self,entity,dataHandler):
        if entity.__class__.__name__ == "DataFrame":
            return [
                {"categoryId": "Table", "title": "NewSample Table", "icon": "fa-table", "id": "newsampleTest"}
            ]
        else:
            return []
    def newDisplayHandler(self,options,entity):
        return TestDisplay(options,entity)

Next, run display() to show the data. Click the Table dropdown. You now see the NewSample Table option, the custom visualization you just created!

In [5]:
display(d1)

Error? If you changed the name yourself in cell 3, you might get an error when you try to display. You can fix this by updating the metadata in the display() cell. To do so, go to the Jupyter menu above the notebook and choose View > Cell Toolbar > Edit Metadata. Then scroll down to the display(d1) cell, click its Edit Metadata button, and change the handlerId.
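To see what the NewSample template produces without running Spark, the following sketch builds the same table structure in plain Python: one header cell per field, then one row per record, capped at 100 records like entity.take(100). The function render_table and the dictionary-based rows are stand-ins for the template engine and the DataFrame. They are assumptions for illustration, not PixieDust APIs.

```python
def render_table(field_names, rows, limit=100):
    """Mimic the NewSample template: header cells, then up to `limit` rows."""
    head = "".join("<th>{0}</th>".format(name) for name in field_names)
    body = ""
    for row in rows[:limit]:
        cells = "".join("<td>{0}</td>".format(row[name]) for name in field_names)
        body += "<tr>{0}</tr>".format(cells)
    return (
        '<table class="table table-striped">'
        "<thead>{0}</thead><tbody>{1}</tbody></table>"
    ).format(head, body)

# Two records taken from the d1 sample data.
sample_rows = [
    {"year": 2010, "zone": "Camping Equipment", "unique_customers": 3},
    {"year": 2010, "zone": "Golf Equipment", "unique_customers": 1},
]
html = render_table(["year", "zone", "unique_customers"], sample_rows)
print(html)
```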

Build a renderer

PixieDust lets you switch between renderers for charts and maps. We'd love to add more to the list. It's easy to get started. Try the generate tool to create a boilerplate renderer using a quick CLI wizard. Read how to build a renderer.

Enter an issue

Found a bug? Thought of a great enhancement? Enter an issue to let us know. Tell us what you think.

Share PixieDust

If you think someone you know would be interested in PixieDust, spread the word.

Learn more

Ready to pitch in? We can't wait to see what you share. More on how to contribute.

End of chapter. Return to table of contents

Authors

  • Jose Barbosa
  • Mike Broberg
  • Inge Halilovic
  • Jess Mantaro
  • Brad Noble
  • David Taieb
  • Patrick Titzler


Copyright © IBM Corp. 2017, 2018. This notebook and its source code are released under the terms of the MIT License.