The Research Lifecycle at SOMAR
Find SOMAR Data
Researchers can find SOMAR data in the SOMAR dataset search. Below are a few search tips to find the data you need.
Can I search for data by social media platform?
When SOMAR data come from a specific social media platform, we list the platform in the “External Data Sources” field. If available, filter by this field or use the platform name in keyword search to find results with the social media platform you’re looking for.
Can I search for data by the name of social media accounts?
Our dataset search looks for terms anywhere in the study metadata (title, summary, subject terms, etc.) but does not match search terms to content in data or documentation files. While some names of social media accounts may be found in the study metadata, you will find more results by searching for data by topic or other keywords. If you find a relevant dataset, the documentation files (such as READMEs or codebooks) often list what accounts or parameters of accounts are included in the data.
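The search scope described above can be pictured with a minimal Python sketch. It is an illustration only, not SOMAR's actual search implementation: the field names (`title`, `summary`, `subject_terms`, `external_data_sources`, `codebook_text`), the function `matches_metadata`, and the example record are all hypothetical.

```python
# Hypothetical field names; the real SOMAR search index may differ.
SEARCHED_FIELDS = ("title", "summary", "subject_terms", "external_data_sources")


def matches_metadata(study: dict, term: str) -> bool:
    """Return True if `term` appears in any searched metadata field."""
    needle = term.casefold()
    for field in SEARCHED_FIELDS:
        value = study.get(field, "")
        # Some fields hold a list of terms; flatten them to one string.
        text = " ".join(value) if isinstance(value, (list, tuple)) else str(value)
        if needle in text.casefold():
            return True
    return False


# Made-up record for illustration only.
example_study = {
    "title": "Public health messaging on social media",
    "summary": "Posts about vaccination collected from public accounts.",
    "subject_terms": ["public health", "vaccination"],
    "external_data_sources": ["ExamplePlatform"],
    # File contents are deliberately not part of SEARCHED_FIELDS.
    "codebook_text": "Accounts included: @example_health_agency",
}

print(matches_metadata(example_study, "vaccination"))             # topic keyword
print(matches_metadata(example_study, "@example_health_agency"))  # account name in a file
```

In this sketch the topic keyword matches because it appears in the summary and subject terms, while the account name does not match because it appears only in a documentation file.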
What if I can’t find the data I am looking for?
If you cannot find the social media data you’re looking for, email us at somar-help@umich.edu. We are happy to guide you to the data you need or help to find it elsewhere!
Public-Use vs. Restricted-Use Data
SOMAR provides access to different levels of data, from fully public to more sensitive information that requires extra safeguards.
Public-Use Data
- Available for download from the ICPSR website
- Data are de-identified and stripped of information that could identify individuals or organizations
- Researchers do not need to be ICPSR members to download public data from SOMAR
Restricted-Use Data
- Requires an approved application and may require a signed Restricted Data Use Agreement (RDUA)
- Includes more detailed information (like geographic specifics, timestamps, or demographic combinations) that could increase privacy risks if misused
- Access is provided through secure downloads or the Virtual Data Enclave (VDE), with strict security and usage requirements
- Codebooks and documentation are publicly downloadable from our catalog to help researchers decide if a restricted dataset is right for their research
Tip: Public-use data are faster to access but less detailed. Restricted-use data provide deeper insights but come with higher responsibility.
Application Process
- Visit the SOMAR Data Access Portal
- Complete all required application fields
- Attach any required documentation (such as a Restricted Data Use Agreement or IRB approval) to avoid delays
- Submit your application
Tip: A strong application is complete, concise, and clearly outlines your research purpose, justification for access, and security plan.
How long does the application process usually take?
Approval typically takes up to 6 weeks. If the data you request are accessed through SOMAR’s Virtual Data Enclave (VDE), setting up the secure environment can take up to 10 additional business days after your application is approved.
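For planning purposes, the upper bound of this timeline can be estimated with a short Python sketch. It is a rough planning aid rather than an official calculator: the function names are hypothetical, business days are assumed to be Monday through Friday, and holidays are ignored.

```python
from datetime import date, timedelta

REVIEW_WEEKS_MAX = 6               # approval takes "up to 6 weeks"
VDE_SETUP_BUSINESS_DAYS_MAX = 10   # VDE setup takes up to 10 more business days


def add_business_days(start: date, days: int) -> date:
    """Advance `start` by `days` weekdays (Mon-Fri); holidays are ignored."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            remaining -= 1
    return current


def latest_access_date(submitted: date, needs_vde: bool) -> date:
    """Latest expected access date under the stated upper bounds."""
    approved_by = submitted + timedelta(weeks=REVIEW_WEEKS_MAX)
    if needs_vde:
        return add_business_days(approved_by, VDE_SETUP_BUSINESS_DAYS_MAX)
    return approved_by


# Example: an application submitted on Monday, January 6, 2025.
print(latest_access_date(date(2025, 1, 6), needs_vde=False))  # 2025-02-17
print(latest_access_date(date(2025, 1, 6), needs_vde=True))   # 2025-03-03
```

These dates are worst-case estimates based on the typical figures above; actual review times vary by dataset and application.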
Characteristics of a strong application
While each dataset has its own requirements, a quality application generally includes the following:
- Clearly defined research objectives – Explain what you aim to study
- Practical methodology – Outline how you will analyze the data and achieve your objectives
- Anticipated outputs – Describe the type of results, reports, or publications you plan to produce
- Data protection measures – Detail how you’re safeguarding sensitive information
- Required documentation – Provide any necessary documents, such as a Restricted Data Use Agreement or IRB approval, signed by the appropriate individuals
- Complete and accurate information – Fill all required fields and provide information as requested, following format or domain requirements
1. Public Data
2. Controlled Download Data
Controlled Download is one way that SOMAR places access restrictions on sensitive or identifiable data. Data shared under the Controlled Download format require a brief application through the SOMAR data application portal. After your application is approved, SOMAR staff will provide a secure link to download the data. You may then analyze the data in your own computing environment.
Documentation files are publicly downloadable so researchers can learn more about the dataset and determine if it is right for their research before submitting an application.
3. Virtual Data Enclave (VDE)
Some datasets contain highly sensitive or identifiable information that requires enhanced security measures. Accessing these datasets means conducting your analysis entirely within SOMAR’s secure Virtual Data Enclave (VDE). The Additional Notes field on the dataset will say whether the dataset is restricted to the SOMAR Virtual Data Enclave.
Documentation files are publicly downloadable so researchers can learn more about the dataset and determine if it is right for their research before submitting an application.
See the SOMAR VDE page for more information on the VDE environment.
Output Vetting
When researchers use restricted data in SOMAR’s Virtual Data Enclave (VDE), any output from their VDE workspaces must be vetted by SOMAR for disclosure risk of identifiable data before release and inclusion in publications or other research outputs. Follow the steps in SOMAR’s output vetting guidelines to prepare your results and request a review. These instructions are also available in the VDE.
Publishing Written Works with SOMAR Data
Researchers using SOMAR data in their written works must provide a bibliographic citation for each completed manuscript, thesis abstract, or other written work. This allows SOMAR and ICPSR to provide funding agencies, partners, and other stakeholders with essential information about the use of social media archive resources and to facilitate the exchange of information about researchers’ research activities, thus increasing the impact of social media research. Reference this FAQ page to learn about citing data.
Finishing Up Research in SOMAR’s Virtual Data Enclave
Access to restricted data for a research project at the University of Michigan officially begins once the Restricted Data Use Agreement (RDUA) is signed by the Lead Researcher or Lead Researcher’s Institutional Signatory. Typically, data access periods expire two years after an RDUA is signed, unless the project team requests an agreement renewal.
A research project working with VDE data is considered finished when all required documentation is submitted and the RDUA is formally closed. A team that is nearing the end of its data access period, or that requests project closure (usually via somar-help@umich.edu), finishes its work in its VDE workspace and provides a final report on its published or to-be-published written works (a template is provided during the closing process). U-M staff then complete the process to end the project.