Python homework

Overview of the Assignment:
In this assignment, you will perform several analysis operations on a large Wikimedia dataset: the 1.2 GB English Wikipedia clickstream from February 2015.
Part 1 – Setup
1.1 Download and save the homework files into the same folder:
Wikipedia Clickstream – Gettting Started-Extended.ipynb
London_Sankey.html
2015_02_filtered_en_clickstream.tsv
1.2 Open Wikipedia Clickstream – Gettting Started-Extended.ipynb in Jupyter Notebook. Read and execute the first section. (No screenshots are needed for this part.)
Part 2 – Getting to know the Data
2.1 Loading the Data: Run the three cells in the section labelled Loading the Data.
2.2 Top articles: Run the cell in the section labelled “Top articles”.
2.2.1 Describe what the Python code did.

2.2.2 Edit this cell to get only the 25th most referenced page and return only the (Wikipedia) name. Assign this to a variable named “views25” and display it in Jupyter. (Hint: use .index.values[0] on the grouped data.)
• Take a screenshot for your Word report
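
The hint in 2.2.2 can be illustrated on a toy table. This is only a sketch: the column names (prev, curr, n) and the data are assumptions made up for illustration, and the notebook's own cell may be structured differently.

```python
import pandas as pd

# Toy stand-in for the clickstream table. The column names
# (prev, curr, n) and the rows are assumptions for illustration.
df = pd.DataFrame({
    "prev": ["A", "B", "C", "A", "B"],
    "curr": ["X", "X", "Y", "Z", "Y"],
    "n":    [10, 5, 30, 7, 2],
})

# Total views per target article, largest first.
views = df.groupby("curr")["n"].sum().sort_values(ascending=False)

# The k-th ranked article sits at zero-based position k - 1.
k = 2
kth = views.iloc[k - 1:k]           # one-row Series, indexed by article name
kth_name = kth.index.values[0]      # keep only the article name
print(kth_name)
```

Slicing with iloc keeps the result as a one-row Series, so .index.values[0] returns the name rather than the count.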

2.3 Top Referers: Run the cell in the section labelled “Top Referers”
2.3.1 Describe what the Python code did.

2.3.2 Edit this cell to get the referring page with the 16th most referrals and return only the (Wikipedia) name. Assign this to a variable named “refers16” and display it in Jupyter.
• Take a screenshot for your Word report

2.4 Trending on Social Media: Run the cell in the section labelled “Trending on Social Media”
2.4.1 Describe what the Python code did.

2.4.2 Edit this cell to get the 6th most trending article and return only the (Wikipedia) name. Assign this to a variable named “trend7” and display it in Jupyter.
• Take a screenshot for your Word report

2.5 Most Requested Missing Pages: Run the cell in the section labelled “Most Requested Missing Pages”
2.5.1 Describe what the Python code did.

2.5.2 Does the 16th most referring page (from question 2.3.2) have redlinks? In a new cell, search for places where “refers16” goes to a redlink. Use the variable created from question 2.3.2. (Hints: each term in the search must be in parentheses. The “and” operator is “&”.)
• Take a screenshot for your Word report
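
The two hints in 2.5.2 describe how pandas combines conditions. A minimal sketch follows, assuming columns named prev, curr, type and n, and a type value of "redlink"; the rows and the variable named source are hypothetical stand-ins, not the notebook's data.

```python
import pandas as pd

# Toy rows; column names and the "redlink" label are assumptions.
df = pd.DataFrame({
    "prev": ["A", "A", "B", "C"],
    "curr": ["X", "Missing_page", "Missing_page", "Y"],
    "type": ["link", "redlink", "redlink", "link"],
    "n":    [10, 3, 4, 8],
})

source = "A"  # hypothetical stand-in for the variable from 2.3.2

# Each comparison is wrapped in parentheses because & binds more
# tightly than ==; without them the expression is evaluated in the
# wrong order and fails.
mask = (df["prev"] == source) & (df["type"] == "redlink")
hits = df[mask]
print(hits)
```

An empty result would mean that no recorded click from the source page led to a redlink.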

2.6 Searching Within Wikipedia: Run the cells in the section labelled “Searching Within Wikipedia”
2.6.1 Describe what the Python code did.

2.7 Inflow vs Outflow: Run the cells in the section labelled “Inflow vs Outflow”
2.7.1 Describe what the Python code did.

2.7.2 What article has the 9th highest number of inflows?

2.7.3 What article has the 8th highest ratio of outflow to inflow?
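
For 2.7.2 and 2.7.3, inflow can be read as the clicks arriving at an article and outflow as the clicks leaving it. The sketch below computes both and ranks by their ratio on invented data; the column names (prev, curr, n) are assumptions, and the notebook may compute the ratio differently.

```python
import pandas as pd

# Invented rows: each row is n clicks from page prev to page curr.
df = pd.DataFrame({
    "prev": ["A", "A", "B", "C", "B"],
    "curr": ["B", "C", "C", "A", "A"],
    "n":    [10, 5, 8, 4, 2],
})

# Inflow: clicks arriving at each article. Outflow: clicks leaving it.
inflow = df.groupby("curr")["n"].sum().rename("inflow")
outflow = df.groupby("prev")["n"].sum().rename("outflow")

# Align the two Series on article name; drop articles missing either side.
flows = pd.concat([inflow, outflow], axis=1).dropna()
flows["ratio"] = flows["outflow"] / flows["inflow"]

# Highest outflow-to-inflow ratio first.
ranked = flows.sort_values("ratio", ascending=False)
print(ranked)
```

The k-th ranked article is then ranked.index[k - 1].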

2.8 Filtering traversals: Run the cells in the section labelled “Filtering traversals”
2.8.1 Describe what the Python code did.

2.8.2 Why would Wiki be at the top of the resulting list?

2.9 What is this code doing (#1)?: Run the cells in the section labelled “What is this code doing (#1)?”
2.9.1 Describe what the Python code did.

2.10 What is this code doing (#2)?: Run the cells in the section labelled “What is this code doing (#2)?”
2.10.1 Describe what the Python code did.

Part 3 – Simple Network Analysis
3.1 Run the cells in the section labelled “Simple Network Analysis”.
3.2 Review the types of graph classes supported by the networkx module at https://networkx.github.io/documentation/stable/reference/classes/index.html#which-graph-class-should-i-use

Why was the DiGraph the appropriate choice for the Wikipedia Clickstream data?

3.3 Using the networkx variable “clickstream”, try to find a path from Spencer_Tracy to Bette_Davis. Show your work.

• Take a screenshot for your Word report
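
A path search on a directed graph might look like the sketch below. The edges and counts are invented for illustration and do not come from the dataset, and find_path is a hypothetical helper; only nx.DiGraph, add_weighted_edges_from and nx.shortest_path are networkx calls.

```python
import networkx as nx

# Invented edges (source, target, click count); a real graph would be
# built from the clickstream rows in the notebook.
clickstream = nx.DiGraph()
clickstream.add_weighted_edges_from([
    ("Spencer_Tracy", "Katharine_Hepburn", 120),
    ("Katharine_Hepburn", "Bette_Davis", 45),
    ("Bette_Davis", "Joan_Crawford", 60),
])

def find_path(graph, source, target):
    """Return a shortest directed path as a list of nodes, or None."""
    # Guard against missing nodes, which shortest_path would reject.
    if source not in graph or target not in graph:
        return None
    try:
        return nx.shortest_path(graph, source, target)
    except nx.NetworkXNoPath:
        return None

forward = find_path(clickstream, "Spencer_Tracy", "Bette_Davis")
backward = find_path(clickstream, "Bette_Davis", "Spencer_Tracy")
print(forward)
print(backward)
```

In this toy graph the forward search finds a path while the reverse search returns None, because a DiGraph follows edges only in the direction of the click.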
