The CSA Core provides research support services and methodological expertise to PRI/SSRI researchers at all stages of the funded research cycle. These services make it possible for researchers to conduct complex projects with collaborators across multiple disciplines and institutions. CSA staff have the expertise to support the development of data collection instruments; data acquisition, management, cleaning, documentation, and archiving; construction of complex data files using primary and secondary data; and statistical and spatial analyses. A distinctive strategy of the CSA Core is to provide a conduit to research resources that can be leveraged for large-scale projects that span multiple data sources and methodological approaches. CSA staff and directors have worked to develop appropriate access and platforms that reduce the startup costs of, and barriers to, high-performance computing (HPC) infrastructure and restricted-access data sources available across Penn State and at other institutions. The CSA Core participates in research in a variety of domains but offers specific expertise in the following areas.

Big Data for population and social science research

The CSA Core has greatly expanded its capacity to support Big Data for population and social science research in areas including data integration across multiple dimensions and scales, social media analytics, and innovative and nontraditional data collection and analysis. This expansion will continue, with greater attention devoted to the skills needed for projects that integrate multi-scale big data and process them within the HPC environments available at Penn State. CSA is also expanding its ability to work across resources and data infrastructure as more researchers collaborate on larger multidisciplinary projects that compile data from multiple sources.

Multi-dimensional and multi-scale data integration and analysis

This area includes the integration and analysis of large spatial, historical, individual, and contextual datasets. To tackle increasingly complex population and societal problems, researchers need data that are often scattered across multiple sources, measured at different scales, and stored in a variety of formats; integrating and analyzing such data effectively and efficiently requires knowledge, working experience, and methodological expertise. The Core has developed expertise and created resources to meet these challenges through its work with PRI and SSRI researchers on many projects. It supports the collection of intensive spatiotemporal data on individuals and contexts as well as the construction of contextual and ecological databases.

Social media analytics

Social media provide massive real-time data that give demographers and social scientists significant opportunities to study social problems and advance population science. The Core has built infrastructure for, and expertise in, collecting and analyzing social media data, including more than 50 TB of geotagged tweets covering the entire world since 2013, as well as Facebook data.

Innovative and nontraditional data collection and analysis

Besides social media data, the Core also supports the collection and analysis of other types of innovative and nontraditional data, including commercial data, remote sensing data, mobile device data, and web scraping data.

Spatial analysis and statistics

The CSA Core helps researchers incorporate geographic information into their research. CSA Core staff specialize in spatial analysis, spatial statistics, exploratory spatial data analysis, and customized programming for geographic information systems (GIS) and online visualization. While the spatial component of the Core offers many services, the Core has the following primary foci:

  1. Innovating in the collection, handling, and utilization of spatiotemporal data by enhancing capacity and efficiencies in cutting-edge technologies;
  2. Maintaining best practices and ethical standards to ensure the privacy and confidentiality of georeferenced data;
  3. Developing the next generation of best practices as real-time, fine-grained data become available to researchers;
  4. Promoting the creative integration of spatiotemporal data, theories, and methods in population science; and
  5. Staying abreast of developments in GIS and advanced spatial analysis and statistics methods.

The Core’s geospatial data resources are extensive, including cleaned and easily usable public domain data from the U.S. Census and other federal, state, and local agencies. The Core also has expertise in helping researchers harness open-source or commercial geospatial data.

The CSA Core specializes in and advances spatial methods and works alongside researchers to make use of multilevel or hierarchical models (including multi-membership spatial data models), spatial econometric models, geographically weighted regression models, and spatial panel data methods. The geospatial data collected and analyzed are often derived from areal, point, and line data layers and their integration. Leveraging spatial relations and topographic linkages between data layers enables the construction of unique data products and/or the ability to examine individual-level outcomes in new geographies or place-based contexts. 
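
As a rough illustration of what linking data layers can involve, the sketch below assigns point locations to the areal unit that contains them using a basic ray-casting test. It is a minimal sketch and not the Core's actual workflow: the tract shapes, respondent coordinates, and function names are all hypothetical, and real projects would typically rely on a GIS library and projected coordinates.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    `polygon` is a list of (x, y) vertices in order; the last vertex
    is implicitly connected back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_areas(points, areas):
    """Map each point ID to the first area containing it, or None."""
    linked = {}
    for point_id, (x, y) in points.items():
        linked[point_id] = next(
            (name for name, poly in areas.items() if point_in_polygon(x, y, poly)),
            None,
        )
    return linked


# Hypothetical areal layer: two adjacent square "tracts".
tracts = {
    "tract_A": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "tract_B": [(4, 0), (8, 0), (8, 4), (4, 4)],
}
# Hypothetical point layer: individual-level locations.
respondents = {"r1": (1.0, 1.0), "r2": (6.5, 2.0), "r3": (9.0, 9.0)}

linked = assign_areas(respondents, tracts)
print(linked)
```

Once each individual carries an area identifier, contextual measures for that area can be merged onto the individual-level records; points that fall outside every area are left unassigned rather than guessed.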

Support for data access and analyses of restricted and big data sources

CSA hosts restricted datasets that require stand-alone computers or secure server access. CSA provides an extensive array of services to ensure appropriate data access to individuals or larger group projects. In addition, CSA staff have developed procedures for supporting faculty projects housed in other secure data sites at Penn State.

To facilitate the processing and analysis of Big Data, CSA has worked closely with staff at Penn State ICDS’s Advanced CyberInfrastructure (ACI), an HPC infrastructure composed of more than 23,000 cores and 6 petabytes of storage. The Core received a Computational Resource Allocation Award of 50,000 SUs (service units, where 1 SU is equal to 1 core-hour on Bridges) and 40 TB of storage for data management and analytics. The Core can add or remove accounts for the projects in which its staff are involved.
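
To give a sense of scale for the service-unit definition above, the following sketch works through the arithmetic for a single job. The job size (28 cores for 12 hours) and the helper name are hypothetical and chosen only for illustration; actual charging policies may differ by system and queue.

```python
ALLOCATION_SU = 50_000  # award size stated above; 1 SU = 1 core-hour


def job_cost_su(cores: int, hours: float) -> float:
    """Service units consumed by one job at 1 SU per core-hour."""
    return cores * hours


# Hypothetical job: 28 cores running for 12 hours.
cost = job_cost_su(cores=28, hours=12)

# How many such jobs the full allocation would cover.
jobs_that_fit = int(ALLOCATION_SU // cost)

print(f"One job uses {cost} SUs; the allocation covers {jobs_that_fit} such jobs.")
```

Under these assumptions a single job consumes 336 SUs, so the allocation would cover 148 jobs of that size.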
 

Programming and statistics

The director and managing director work with staff to provide expert consultation on the selection and implementation of statistical methods, drawing on existing faculty expertise to address statistical problems that arise during project development, grant-proposal preparation, or the course of funded projects. CSA also keeps PRI/SSRI faculty abreast of major developments in the statistical analysis of population data, enabling them to use the most powerful research designs and statistical methods appropriate for their demographic research. For more information about programming and statistics services, please refer to PRI’s Data and Analysis section.

Contact the CSA Core at csa-info@psu.edu to discuss your needs and to learn how we can support your project.