Introduction to this course


Figure 1

Two people with computational expertise holding a giant book towards two other people who conduct lab experiments. The book says: how to apply data science in biology.

Better and faster research!


Figure 1

Researchers represented in a map indicating their journey to understand and apply computational approaches. Some may have just started their journey, some may have come far in the learning and some may have gained proficiency based on their research requirements.

Figure 2

Researchers pour water on a tree. The water represents data science and the tree represents the research.

Figure 3

Shows a landscape with different checkpoints for data, code, tools and results, each of which requires reproducible practices. A woman explains her reproducibility journey to help new people start theirs.

Figure 4

A house representing machine learning and AI is set upon bricks that one person is sliding below the house. On the bricks, we can read data science principles like open science, backups, reproducibility, and FAIR principles.

Figure 5

drawing

What is special in a data science project?


Figure 1

Specificity of data science projects. Five blocks (working online, large teams whose members have specialised skills, writing code and re-using code) are placed around a central block where reproducible analysis is written. Data specifics by Julien Colomb CC-BY 4.0

Reproducibility


Figure 1

A matrix with data and analysis on its two axes, illustrating that a result is reproducible when the same analysis applied to the same data gives the same result.

Figure 2

Ways of capturing computational environments

An introduction to version control


Figure 1

Contrast in project history management. On the left: choosing between ambiguously named files. On the right: picking between successive versions (from V1 to V6).

Setting up a computational project


Figure 1

The research process is represented as a perpetual cycle of generating research ideas, performing data planning and design, data collection, and data processing and analysis, publishing, preserving and hence, allowing re-use of data.

Implementing tools and methods during the project


Figure 1

drawing

Figure 2

A traditional Kanban for a collaborative computational project. Keeping track of bugs and what everyone is working on.


Figure 3

drawing

Figure 4

drawing

Research Data Management


Fostering documentation


Figure 1

Image shows a person putting lamp-posts of documentation, helping a researcher who was lost because of lack of information about the research.

Scientific rigour with code


Coding basics


Figure 1

drawing

Figure 2

drawing

Figure 3

drawing

Figure 4

box plots

Figure 5

box plots

Figure 6

box plots

Figure 7

box plots

Figure 8

box plots

Code testing and Review


Figure 1

The Turing Way project illustration by Scriberia. Used under a CC-BY 4.0 licence. DOI: 10.5281/zenodo.3332807.


Figure 2

Garden of code

Figure 3

drawing

Figure 4

drawing

Figure 5

drawing

Figure 6

drawing

Figure 7

Continuous Integration with GitHub Actions. The Turing Way project illustration by Scriberia. Used under a CC-BY 4.0 licence. DOI: 10.5281/zenodo.3332807.


Code Modularity


Figure 1

drawing

Figure 2

drawing

Publication and release


Figure 1

drawing

Figure 2

drawing

Figure 3

You can release code and data associated with a research article as a series of files/folders. If your project follows the folder template introduced in a previous episode, for example: drawing


Figure 4

drawing

Figure 5

drawing

Figure 6

drawing

Figure 7

drawing

Figure 8

drawing

Figure 9


Figure 10

drawing

Figure 11

drawing

Figure 12

Extensive tools to annotate and view images, including whole slide & microscopy images. Interactive machine learning for both object & pixel classification. drawing


Open Science Practices


Figure 1

Image shows a person having an internal debate about open versus closed research. Open means new opportunities and inclusivity, but closed may be required to protect sensitive data, or may be wrongly assumed to be necessary to secure funding for novel work.

Figure 2

Image shows a woman slowly gaining trust and confidence in opening up her research project and benefitting from open collaboration.

Data and code citation