Data and the FAIR Principles

Lesson 4: Your Laboratory Datastore

Overview

Teaching: Self-paced
Exercises: 0 min
Questions
  • What resources are available for me to be a good steward of my laboratory’s data?

Objectives
  • Learn about off-the-shelf resources a laboratory can use to be a good steward of its data

  • Learn about different databasing options if a custom solution is desired

Introduction

This lesson points to information and training on several research data management platforms. It also lists courses that introduce databases in general, along with course offerings for some popular database platforms.

Introduction

When managing data in your own laboratory, you have several options. A number of products, both academic and commercial, let you manage your neuroimaging and associated data within their platform.

Selected External Material

Neuroimaging Data Platforms

XNAT:

Overview: XNAT is an open source imaging informatics platform developed by the Neuroinformatics Research Group at Washington University. It was originally developed in the Buckner Lab, which was then at Washington University and is now at Harvard University. XNAT facilitates common management, productivity, and quality assurance tasks for imaging and associated data. Thanks to its extensibility, it can be used to support a wide range of imaging-based projects.

Documentation: Full documentation for XNAT can be found on their Wiki page (https://wiki.xnat.org/documentation)

LORIS:

Overview: The Longitudinal Online Research and Imaging System (LORIS) is web-based data and project management software for neuroimaging research studies. It is an open source framework for storing and processing behavioural, clinical, neuroimaging and genetic data. LORIS also makes it easy to manage large datasets acquired over time in a longitudinal study, or at different locations in a large multi-site study.

Documentation: Documentation for LORIS can be found on their GitHub Wiki page (https://github.com/aces/Loris/wiki/Setup)

Flywheel (Commercial):

Overview: Flywheel is a data management platform designed to ease the IT burden of the researcher by creating a collaborative environment for reproducible, computational science. Data can be uploaded directly from devices or can be manually uploaded into Flywheel. Once loaded, users can organize and search through data.

Documentation: Documentation for Flywheel can be found on their documentation page (https://docs.flywheel.io)

Build your Own…

While we recommend that you try to use (and potentially contribute to) an existing platform, you may still need to develop your own database system. Databases and database technology are well represented in current online course offerings, so we provide below a set of recommended free online courses.
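As a rough illustration of what a very small custom datastore might look like, the sketch below uses Python's built-in sqlite3 module. The table names (subject, scan), the column names, and the example values are hypothetical and chosen only for illustration. A real laboratory schema depends on your data, and designing one is what the courses below are meant to help with.

```python
import sqlite3


def create_datastore(path=":memory:"):
    """Open (or create) a SQLite database with two hypothetical tables."""
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS subject (
            subject_id  TEXT PRIMARY KEY,
            enrolled_on TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS scan (
            scan_id    INTEGER PRIMARY KEY,
            subject_id TEXT NOT NULL REFERENCES subject(subject_id),
            modality   TEXT NOT NULL,
            file_path  TEXT NOT NULL
        );
        """
    )
    return conn


def add_subject(conn, subject_id, enrolled_on):
    """Register a subject; the 'with' block commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO subject (subject_id, enrolled_on) VALUES (?, ?)",
            (subject_id, enrolled_on),
        )


def add_scan(conn, subject_id, modality, file_path):
    """Record where a scan file lives and return its generated scan_id."""
    with conn:
        cursor = conn.execute(
            "INSERT INTO scan (subject_id, modality, file_path) VALUES (?, ?, ?)",
            (subject_id, modality, file_path),
        )
    return cursor.lastrowid


def scans_for_subject(conn, subject_id):
    """Return (modality, file_path) tuples for one subject, oldest first."""
    return conn.execute(
        "SELECT modality, file_path FROM scan WHERE subject_id = ? ORDER BY scan_id",
        (subject_id,),
    ).fetchall()
```

With the foreign key in place, trying to record a scan for a subject that was never registered raises sqlite3.IntegrityError. This is one small example of the kind of consistency check a database can give you over a folder of spreadsheets.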

Selected External Material

Introduction to Databases

Databases Self Paced MOOC at Stanford Online:

Abstract: Databases are incredibly prevalent – they underlie technology used by most people every day if not every hour. Databases reside behind a huge fraction of websites; they’re a crucial component of telecommunications systems, banking systems, video games, and just about any other software system or electronic device that maintains some amount of persistent information. In addition to persistence, database systems provide a number of other properties that make them exceptionally useful and convenient: reliability, efficiency, scalability, concurrency control, data abstractions, and high-level query languages. Databases are so ubiquitous and important that computer science graduates frequently cite their database class as the one most useful to them in their industry or graduate-school careers. (content from course site)

Introduction to Relational Databases at Udacity:

Abstract: This course is a quick, fun introduction to using a relational database from your code, using examples in Python. You’ll learn the basics of SQL (the Structured Query Language) and database design, as well as the Python API for connecting Python code to a database. You’ll also learn a bit about protecting your database-backed web apps from common security problems. (content from course site)
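The abstract does not name the security problems it covers. One well-known example for database-backed code is SQL injection, and we assume here that it is among them. The sketch below uses Python's built-in sqlite3 module with a hypothetical users table to contrast building a query by string formatting with passing the value as a bound parameter.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a small, made-up users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
    conn.executemany(
        "INSERT INTO users (name, role) VALUES (?, ?)",
        [("alice", "admin"), ("bob", "student")],
    )
    conn.commit()
    return conn


def find_user_unsafe(conn, name):
    """UNSAFE: the input is pasted into the SQL text and can change the query."""
    query = f"SELECT name, role FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    """Safer: the '?' placeholder makes the driver treat the input purely as data."""
    return conn.execute(
        "SELECT name, role FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    crafted = "nobody' OR '1'='1"
    # The unsafe version matches every row; the safe version matches none.
    print(len(find_user_unsafe(conn, crafted)))
    print(len(find_user_safe(conn, crafted)))
```

The crafted input turns the unsafe WHERE clause into a condition that is always true, so every row comes back. With the placeholder version, the same input is compared literally against the name column and matches nothing.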

Introduction to NoSQL and Database as a Service at CognitiveClass:

Abstract: Get technical hands-on knowledge of NoSQL (Non-SQL or Not-only-SQL) databases and Database-as-a-Service (DBaaS) offerings. With the advent of Big Data and agile development methodologies, NoSQL databases have gained a lot of relevance. Their main advantage is the ability to handle effectively scalability and flexibility issues for modern applications. (content from course site)

Lessons for Specific Database Platforms

MariaDB:

Abstract: MariaDB is an open-source relational database that began as a community-developed fork of MySQL. The MariaDB project provides a collection of online learning resources.

MongoDB:

Abstract: MongoDB provides a collection of free lessons covering all aspects of the MongoDB platform.

Neo4j:

Abstract: Neo4j, a graph database platform, provides a number of training materials online.

Key Points