Introduction to Python Data Analysis Projects


  • Projects have common structures
  • Packaging enables a project to be installed
  • An environment lets different people use the same package versions, so software runs more reliably
  • Documentation is an essential component of any complete project and should live alongside the code

Setting up a Project


  • Data and code should be governed by different principles
  • A package enables a project to be installed
  • An environment lets different people use the same package versions, so software runs more reliably
  • Documentation is an essential component of any complete project and should live alongside the code

Packaging Python Projects


  • Packaged code is reusable within and across systems
  • A Python package consists of modules
  • Projects can be distributed in many ways and installed with a package manager
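
As a rough illustration of the point that a package consists of modules, the sketch below writes a tiny package to a temporary directory and imports one of its modules. The names `demo_analysis` and `stats` are hypothetical, chosen only for this example; a real project would keep these files in its repository rather than generate them.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package name, used for illustration only.
PACKAGE_NAME = "demo_analysis"


def build_demo_package(root: Path) -> Path:
    """Write a tiny package with one module under `root` and return its directory."""
    package_dir = root / PACKAGE_NAME
    package_dir.mkdir()
    # __init__.py marks the directory as a package.
    (package_dir / "__init__.py").write_text("")
    # stats.py is a module that lives inside the package.
    (package_dir / "stats.py").write_text(
        "def mean(values):\n"
        "    return sum(values) / len(values)\n"
    )
    return package_dir


# Keep a reference to the temporary directory so it is not cleaned up early.
tmp = tempfile.TemporaryDirectory()
build_demo_package(Path(tmp.name))

# Make the directory importable, then import the module by its dotted name.
sys.path.insert(0, tmp.name)
importlib.invalidate_caches()
stats = importlib.import_module(f"{PACKAGE_NAME}.stats")

result = stats.mean([1, 2, 3])
```

Editing `sys.path` by hand is only a stand-in here; installing the package with a package manager is what makes it importable in normal use.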

Managing Python Environments with Conda


  • A Python dependency is a separate, independent package that a given project requires in order to run
  • An environment is
  • An environment manager enables one-step installation and documentation of dependencies, including their versions
  • Conda is the environment manager included with Anaconda; it is also an installer
  • Other popular environment managers are FIXME
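
To make "documentation of dependencies, including versions" more concrete, here is a minimal sketch that lists the packages installed in the current Python environment with their versions, using only the standard library. The function name `pinned_requirements` is hypothetical. An environment manager records this kind of information for you; this is not how Conda itself does it.

```python
from importlib import metadata


def pinned_requirements():
    """Return sorted 'name==version' strings for the installed distributions."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        # Skip distributions whose metadata is missing a name.
        if name:
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


pins = pinned_requirements()
for line in pins:
    print(line)
```

The output depends entirely on what is installed, which is the point: two people with different environments will see different lists.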

Managing Python Environments with VirtualEnv


  • A Python dependency is a separate, independent package that a given project requires in order to run
  • An environment is
  • An environment manager enables one-step installation and documentation of dependencies, including their versions
  • Virtualenv is …

Getting started with Documentation


  • Documentation tells people how to use code and provides examples
  • Types of documentation include: literal, API, and tutorial/example
  • Literal documentation lives outside the code and explains the big-picture ideas of the project and how to get it set up
  • API documentation lives in docstrings within the code and explains how to use functions in detail
  • Examples are scripts (or notebooks, or code excerpts) that live alongside the project and connect the details to the common tasks

Documentation in Code


  • Docstrings describe functions
  • Comments throughout the code help with onboarding and debugging
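
The two points above can be sketched in a single function. The function `normalize` is a hypothetical example, and the docstring follows the NumPy layout, which is one common convention among several, not a requirement.

```python
def normalize(values):
    """Scale a list of numbers to the range 0 to 1.

    Parameters
    ----------
    values : list of float
        The numbers to rescale. Must not be empty.

    Returns
    -------
    list of float
        The rescaled numbers, in the same order as the input.

    Examples
    --------
    >>> normalize([2, 4, 6])
    [0.0, 0.5, 1.0]
    """
    low = min(values)
    high = max(values)
    span = high - low
    # Guard against dividing by zero when every value is identical.
    if span == 0:
        return [0.0 for _ in values]
    return [(v - low) / span for v in values]
```

The docstring tells a user what the function expects and returns, while the inline comment explains a decision that would otherwise puzzle someone reading or debugging the body.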

Building Documentation with Sphinx


  • Building documentation into a website is a common way of distributing it
  • Sphinx can automatically build a website from plain text files and your docstrings
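
Sphinx is configured through a Python file named `conf.py`. The fragment below is a minimal sketch of what such a file might contain; the project and author values are placeholders, and the chosen extensions assume the docstrings are meant to be pulled into the site.

```python
# conf.py: a minimal Sphinx configuration sketch (all values are placeholders).

project = "demo_analysis"   # hypothetical project name
author = "Project Authors"  # placeholder author string

extensions = [
    "sphinx.ext.autodoc",   # pulls docstrings from the code into the pages
    "sphinx.ext.napoleon",  # parses NumPy- and Google-style docstrings
]

html_theme = "alabaster"    # the theme Sphinx uses by default
```

With a file like this next to the plain text source pages, the Sphinx build command turns both the pages and the docstrings into HTML.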

Publishing code and data


Testing and Continuous Integration