Research Data
The recorded factual material commonly accepted in the scientific community as of sufficient quality to validate and replicate research findings, regardless of whether the data are used to support scholarly publications.
Research Integrity
An umbrella term that covers the use of honest and verifiable methods in proposing, performing, and evaluating research; reporting research results with particular attention to adherence to rules, regulations, guidelines; and following commonly accepted professional codes or norms.
Reproducibility
Efforts and strategies that are generally concerned with establishing the credibility, reliability, and validity of scientific research.
Methods Reproducibility
The provision of enough detail about study procedures and data so the same procedures could, in theory (or in actuality) be exactly repeated.
Results Reproducibility
Refers to obtaining the same results from the conduct of an independent study whose procedures are as closely matched to the original experiment as possible. Also called replicability.
Robustness
The stability of experimental conclusions to variations in either baseline assumptions or experimental procedures.
Generalizability
The persistence of an effect in settings different from and outside of an experimental framework.
Data Provenance
The documented trail that describes the origin of a piece of data as well as how it has been processed and transformed over time.
Data Management
The process of validating, organizing, protecting, maintaining, and processing scientific data to ensure the accessibility, reliability, and quality of the scientific data for its users.
Data Sharing
The act of making scientific data available for use by others (e.g., the larger research community, institutions, the broader public), for example, via an established repository.

Data Quality
A broad concept that refers to the degree to which a set of data is fit for its intended purpose. Related to data management, but also includes topics involving methodological rigor.
Data Usability
The ability to open, understand, make use of, and build upon a set of data. 'Reuse' encompasses many potential activities, including using a dataset for education and training (of both human researchers and algorithms), testing new hypotheses (which can involve combining multiple extant datasets), and more.
Disorganization
A way to lose data that refers to the lack of structured organization of data.
Missing documentation/metadata
A way to lose data that refers to the absence of necessary documentation that describes the data.
Failure of storage media
A way to lose data that occurs when the physical devices used to store data fail.
Obsolescence
A way to lose data that refers to the outdated nature of technology or formats that can no longer be accessed.
Improper archiving
A way to lose data that occurs when data is not archived correctly, leading to potential loss.
Foundational to ensuring data integrity
The role of data management in maintaining the accuracy and consistency of data over its lifecycle.
Collaborative efforts in data management
The shared responsibility among all individuals working with research data to ensure it is properly managed.
Primary investigator's responsibility
The ultimate accountability for data management and sharing lies with the primary investigator.

Transparency in research
The principle that proper data management is essential for ensuring openness and reproducibility in the research process.
Requirement of proper data management
The necessity of effective data management practices to facilitate efficient, collaborative, and rigorous research.
Preventing information loss
The role of proper data management in ensuring that materials are not lost and research can proceed efficiently.
Motivations and Expectations
The reasons behind the efforts in data management and sharing.
Discussion on losing data
A conversation about the various ways in which data can be lost.
Practices and strategies in data management
The methods employed to support the quality and usability of research data over time.
Documentation in data management
The importance of maintaining records that describe the data for future reference and usability.
Storage in data management
The methods and practices for securely holding data to prevent loss.
Project Leadership
Empower others to manage research data well.
Setting policy
Responding to feedback.
Communicating practices
Communicating practices and procedures to project collaborators.
Monitoring and auditing
Monitoring and auditing data management practices.
Implementing standardized practices
Implementing standardized practices and procedures.
Providing feedback
Providing feedback to leadership.
Asking questions
Asking questions to clarify data management processes.
Project Team
Contribute to the broad practice of data management.
The Stanford Data Retention Policy
A policy outlining data retention requirements.
The NIH Data Management and Sharing Policy
A policy that governs data management and sharing practices.

Potential consequence of losing research data
A. The research findings based on the data may be called into question. B. Financial penalties from funding agencies and/or study sponsors. C. Research activities must immediately cease. D. All of the above.
FAIR Guiding Principles
Principles that guide data management practices.
Findable
Data should be easy to find by both humans and computers (e.g. use of standardized file names, persistent identifiers).
Accessible
There is a clearly defined method for accessing the data (e.g. use of standardized file organization, data storage and backups).
Interoperable
Data should be usable across a range of applications and workflows (e.g. use of standards, common file formats).
Reusable
Data should be saved, organized, and described with its future (re)use in mind, even if there are no plans to share it openly.
Mischievous Meatball
A scenario where Meatball the cat accidentally knocks your computer off a table, irrevocably damaging its hard drive.
Saving Data
Know where data can be saved.
Sensitive data
Data that must be protected against unauthorized access.
Policies and regulations
Require that appropriate administrative, physical and technical safeguards be taken to ensure the confidentiality, integrity and security of certain information (e.g. HIPAA's 'security rule').
3-2-1 rule
Maintain multiple backups whenever possible: keep 3 copies of the data, on 2 different types of storage media, with 1 copy stored offsite.
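As a rough illustration of the 3-2-1 rule, the sketch below checks a list of backup copies against the usual reading of the rule (at least three copies, at least two media types, at least one offsite). The `BackupCopy` class, its fields, and the example locations are hypothetical and are not taken from any particular backup tool.

```python
from dataclasses import dataclass


@dataclass
class BackupCopy:
    """One copy of a dataset (hypothetical structure for illustration)."""
    location: str   # where the copy lives, e.g. "lab workstation"
    medium: str     # storage type, e.g. "internal disk", "cloud"
    offsite: bool   # True if stored away from the primary site


def satisfies_3_2_1(copies):
    """Return True if the copies meet the common reading of the 3-2-1 rule."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


copies = [
    BackupCopy("lab workstation", "internal disk", offsite=False),
    BackupCopy("external drive in lab", "external disk", offsite=False),
    BackupCopy("institutional cloud storage", "cloud", offsite=True),
]
print(satisfies_3_2_1(copies))
```

A single laptop copy, as in the Mischievous Meatball scenario, fails all three conditions.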
Working storage vs Long term storage
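Working storage holds files in active use; long term storage preserves data after the work is done. They are not the same thing.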
Excel limitations
Excel is limited in that it can silently convert entries to dates, has compatibility issues between versions, lacks audit trails and straightforward version control, and keeps calculations largely invisible.

Staying Organized
Making finding things easy by keeping project-related files in project folders or directories that have a standard structure.
Standardized naming conventions
Maintain standardized naming conventions for both files and the contents within files (i.e., variable names).
Data dictionary or codebook
Names should be recorded in a data dictionary or codebook.
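A data dictionary can be as simple as a table that maps each variable name to its meaning. The sketch below is a minimal example, assuming a small hypothetical dataset; the variable names, types, and units are invented for illustration. It also shows one way to flag columns that have not yet been documented.

```python
# Hypothetical data dictionary: one entry per variable name used in the dataset.
DATA_DICTIONARY = {
    "participant_id": {"type": "string", "units": None,
                       "description": "Study-assigned participant identifier"},
    "age_years": {"type": "integer", "units": "years",
                  "description": "Age at enrollment"},
    "reaction_time_ms": {"type": "float", "units": "milliseconds",
                         "description": "Mean reaction time across trials"},
}


def undocumented_columns(columns, dictionary):
    """Return the column names that have no entry in the data dictionary."""
    return [name for name in columns if name not in dictionary]


# Example: "rt" is not documented, so it is reported.
print(undocumented_columns(["participant_id", "age_years", "rt"], DATA_DICTIONARY))
```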
ReadMe files
Files that provide details about the contents of a dataset or a collection of related files.
Dryad
An example of a data repository that requires a ReadMe to be uploaded with any shared data.
File organization description
A short description of how related files are organized (e.g. directory, subdirectory structures).
File contents description
A short description of what each file contains.
File relationships description
A short description of the relationships between different files (e.g. versions, linked files).
Plain text file
The recommended format for writing a README file (e.g. project-name_readme.txt).
Project name
The name of the project that should be included in the ReadMe.
Dataset authors
Individuals who created the dataset and should be credited in the ReadMe.
Data citation
Recommended citation format for the dataset included in the ReadMe.
License information
Details about the licensing of the dataset included in the ReadMe.
Applicable grant IDs
Grant identifiers that should be included in the ReadMe.
Citations
References to related papers, code, etc., that should be included in the ReadMe.
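The ReadMe elements listed above can be collected into a plain text template. The following is a minimal sketch, not a required format: the field labels follow the cards above, while the function names and the `TODO` placeholder are assumptions made for illustration.

```python
# Field labels taken from the ReadMe cards above.
README_FIELDS = [
    "Project name",
    "Dataset authors",
    "Data citation",
    "License information",
    "Applicable grant IDs",
    "Citations",
    "File organization",
    "File contents",
    "File relationships",
]


def render_readme(values):
    """Render a plain text ReadMe, one 'Field: value' line per field."""
    lines = []
    for field in README_FIELDS:
        # Keep a visible placeholder instead of silently dropping a field.
        lines.append(f"{field}: {values.get(field, 'TODO')}")
    return "\n".join(lines) + "\n"


def missing_fields(values):
    """List the fields that still need to be filled in."""
    return [field for field in README_FIELDS if not values.get(field)]


text = render_readme({"Project name": "example-project"})
print(text)
```

The rendered text could then be saved under a name such as project-name_readme.txt.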
File naming convention
A structured way to name files, such as [Experiment Name]_[Your Name]_[Description]_[YYYY-MM-DD].
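One way to apply such a convention consistently is to generate and check file names in code. The sketch below assumes underscores separate the four fields and hyphens replace spaces within a field; the function names, the default `csv` extension, and the example values are hypothetical.

```python
import re
from datetime import date


def _clean(part):
    """Replace runs of non-alphanumeric characters with a hyphen.

    Underscores are reserved as the separator between fields.
    """
    return re.sub(r"[^A-Za-z0-9]+", "-", part.strip()).strip("-")


def build_filename(experiment, author, description, day, ext="csv"):
    """Assemble [Experiment Name]_[Your Name]_[Description]_[YYYY-MM-DD].ext"""
    fields = [_clean(experiment), _clean(author), _clean(description), day.isoformat()]
    return "_".join(fields) + "." + ext


# Three hyphen-safe fields, an ISO date, then an extension.
NAME_PATTERN = re.compile(
    r"^[A-Za-z0-9-]+_[A-Za-z0-9-]+_[A-Za-z0-9-]+_\d{4}-\d{2}-\d{2}\.[A-Za-z0-9]+$"
)


def follows_convention(filename):
    """Return True if the file name matches the assumed convention."""
    return NAME_PATTERN.match(filename) is not None


name = build_filename("Stroop Task", "J. Doe", "raw data", date(2024, 3, 5))
print(name, follows_convention(name))
```

Writing the date as YYYY-MM-DD means that alphabetical sorting also sorts files chronologically within an experiment.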
Data type
The form in which data is collected and/or organized (e.g. spreadsheets, images, etc.).
Data size
The magnitude of the data, such as the number of participants or approximate file size.
Data restrictions
Reasons for protecting data and/or limiting how it is shared (e.g. IP-related concerns).
Data sensitivity
The risks of disclosure and the policies, laws, and regulations that apply.
Identifying information
Information that can identify an individual, such as names and geographic subdivisions smaller than a state.
Personally identifiable information (PII)
Information that can be used to identify a person.
Private Information
Information that a person could reasonably expect would not be shared.
Confidentiality
The protection of private information from unauthorized access.
Vehicle identifiers and serial numbers
Identifiers related to vehicles, including license plate numbers.
Device identifiers and serial numbers
Identifiers related to devices, including serial numbers.
Web Universal Resource Locators (URLs)
Addresses used to access resources on the internet.
Internet Protocol (IP) address numbers
Numerical labels assigned to devices connected to a computer network.
Regulated Data
Information that is protected by local, national, or international statute or regulation mandating certain restrictions.
Biometric identifiers
Unique biological characteristics used for identification, including finger and voice prints.
Full face photographs
Images capturing the entire face of an individual.
Unique identifying number, characteristic, or code
Any other distinct identifier that can be used to recognize an individual.
Protected health information (PHI)
Information generated in the course of healthcare that can be linked to a particular person.
Data Management Plans (DMP)
Plans outlining how data will be managed and shared in research.
Data Types
A description of data that will be managed and shared.
Software
An outline of any specialized software needed to use the data.
Standards
The standards that will be applied to ensure usability/interoperability.
Preservation and sharing
How data will be made available to others, such as through a data repository.
Restrictions on sharing
Limits on sharing data, such as participant privacy.
Standard Operating Procedure (SOP)
A set of step-by-step instructions compiled by an organization to help workers carry out routine operations.
Data validation and quality assurance measures
Processes to ensure the accuracy and quality of data collected.
Data manipulations from raw to final data
Transformations applied to raw data to prepare it for analysis.
Documentation of process
Records of protocols, standard operating procedures, and methodologies.
Documentation of content
Records such as data dictionaries and codebooks that explain the data.
Field notebooks
Notebooks used for recording observations and data in the field.
Lab notebooks
Notebooks used for documenting experiments and research findings.
Codes for missing values
Designations used in datasets to indicate missing data points.
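Missing-value codes are only useful if they are applied consistently when the data are read. The sketch below is one possible approach using the standard library: it treats a small set of codes as missing and converts them to `None`. The specific codes (`""`, `NA`, `-999`) and column names are assumptions for illustration; the real codes should come from the project's data dictionary or codebook.

```python
import csv
import io

# Hypothetical missing-value codes; document the real ones in the codebook.
MISSING_CODES = {"", "NA", "-999"}


def read_rows(text):
    """Parse CSV text and replace documented missing-value codes with None."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for row in reader:
        cleaned = {}
        for key, value in row.items():
            # DictReader fills short rows with None, so check for that first.
            if value is None or value.strip() in MISSING_CODES:
                cleaned[key] = None
            else:
                cleaned[key] = value
        rows.append(cleaned)
    return rows


rows = read_rows("id,score\n1,42\n2,-999\n3,NA\n")
print(rows)
```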
Collection Confusion
Issues arising from inconsistent data collection methods by different researchers.
Good Clinical Practice (GCP) Guidelines
Guidelines for clinical research that define standard operating procedures (SOPs) as detailed, written instructions to achieve uniformity of the performance of a specific function.