Open Call for Research Projects

Are you a researcher with a research project that requires advanced computational skills?

If so, let us help you.

Tell us more about your research project and the type of skills that you need by filling out the Research Engagement intake form. One of our team members will be in touch to set up a consultation.

2025 Research projects

Facilitator: Karina Cahill
Project Title: HPC, Machine Learning, and Large Language Modeling Applications in Family-focused research and Prevention
Researcher: Kimberly Updegraff

Project Description: This project applies large language models and other machine learning approaches, together with HPC-based analysis optimization, to randomized clinical trial data from a preventive intervention program focused on family relationship dynamics. Specifically, an important line of research in prevention science concerns program engagement and the fidelity of an intervention (i.e., whether the program is implemented as intended).


Facilitator: Mustafa Demir
Project Title: Leveraging High-Performance Computing and Machine Learning to Improve Models of Sea Turtle Nesting and Abundance
Researcher: Sheila Miller Edwards

Project Description: This project uses high-performance computing to improve three existing models of sea turtle conservation: aerial-survey availability bias, nightly nesting abundance, and post-nesting migration behavior. By enabling large-scale simulations across vast combinations of biological and environmental parameters, HPC will strengthen GAM- and agent-based models to better estimate uncertainty, explain variability in nesting, and understand migration route selection. All three modeling efforts aim to complete comprehensive simulations and analyses to improve conservation decision-making for migratory megafauna.


Facilitator: Stevan Earl
Project Title: HPC workflows for processing large-scale environmental, biological and molecular datasets
Researcher: Amy Maas

Project Description: This research interrogates image and molecular datasets, along with environmental data, to analyze the drivers of organismal diversity in marine systems and to explore the biogeochemical contributions of various taxonomic or morphological groups. These datasets have grown to the point that the research team's desktop computers can no longer run the computational pipelines. The goal of this project is to develop a plug-and-play tool that runs on Sol, takes in image data with slightly variable naming conventions, and runs it through an R script to produce two output files (one that characterizes the sizes of organisms and one that provides similarity clusters). The same workflow needs to function for datasets generated by other instruments (with different metadata and conversion factors). Finally, the resulting datasets need to be cross-comparable with molecular datasets, likely through a dissimilarity matrix. Desktop R code (such as the vegan package) can’t handle this computational step, so the use of Torch for R on the Sol supercomputer will be explored.
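As a rough illustration of the dissimilarity matrix mentioned above, the following is a minimal Python sketch, not the project's R pipeline. It assumes Bray-Curtis dissimilarity (the default method of `vegdist` in the vegan package); the sample names, abundance counts, and function names are hypothetical.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("abundance vectors must have the same length")
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    # Two empty samples are treated as identical (an assumption of this sketch).
    return numerator / denominator if denominator else 0.0


def dissimilarity_matrix(samples):
    """Return a dict mapping (sample, sample) pairs to their dissimilarity."""
    names = list(samples)
    return {
        (p, q): bray_curtis(samples[p], samples[q])
        for p in names
        for q in names
    }


# Hypothetical counts of three taxa in three samples.
samples = {
    "s1": [10, 0, 5],
    "s2": [5, 5, 5],
    "s3": [0, 10, 0],
}
matrix = dissimilarity_matrix(samples)
for pair, value in sorted(matrix.items()):
    print(pair, round(value, 3))
```

The matrix has one entry per pair of samples, so its size grows with the square of the number of samples. That growth is one plausible reason this step outgrows a desktop.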


Facilitator: Brent Smith
Project Title: Online Data Feedback for ASU Compact X-ray light Sources using ML approaches
Researcher: Sabine Botha

Project Description: The unique beam properties of ASU’s Compact X-ray Light Source (CXLS) and Compact X-ray Free Electron Laser (CXFEL) promise to enable groundbreaking research, such as serial crystallography of biomolecular targets, quantum materials research, and AMO science. These experiments have made it necessary to develop ML-based data feedback tools, such as an image classifier based on unwanted detector frames generated by the one-of-a-kind instrument, for real-time data reduction to manage the copious amounts of data collected at the full X-ray repetition rate of 1 kHz. Multithreaded analysis of real-time streaming data, scaling from the current analysis of a single detector panel to all 4M pixels of the Eiger area detector, would therefore keep processing times and data storage requirements manageable.


Facilitator: Trevor Whipple
Project Title: Large Language Models for Cyber Threat Intelligence
Researcher: Xusheng Xiao

Project Description: LLM4CTI develops a Large Language Model-based framework to extract structured cyber threat intelligence from unstructured articles, extending prior CTIKG work by using chunked LLM processing, a STIX-inspired schema, and knowledge graph construction to capture richer attacker behaviors. This project will further train Graph Neural Networks on these graphs for relationship prediction and community detection, supporting proactive and zero-day defense strategies. The extraction pipeline will be validated in an HPC environment, with documentation and authorship updates, and facilitation efforts will focus on workflow testing with real-world data integration.


Facilitator: Alan Chapman
Project Title: DETECT – De novo mutation Extraction Through Experimental Cutoff Thresholds
Researcher: Susanne Pfeifer

Project Description: Germline de novo mutation rates are central to evolutionary biology, but estimates vary widely across studies due to methodological differences, underscoring the need for standardized best practices in computational pipelines. Recent work quantified how filtering criteria and sequencing coverage affect mutation detection and led to the development of DETECT, a software package that provides study-specific best-practice recommendations and outperforms existing tools in benchmarking. However, DETECT is currently slow and resource-intensive, motivating a project to optimize, parallelize, and improve its usability with CI facilitation.

2024 Research projects

Facilitator: Deshon Miguel
Project Title: Enhanced Models for Antibody Epitope Prediction
Researcher: Neal Woodbury

Project Description: Peptide array-based antibody molecular recognition profiles can be used to train machine learning models to predict the amino acid sequences most involved in antibody recognition. This project will incorporate docking software and postprocessing data analysis into this lead binding site validation process. The goal is to estimate the binding levels of off-target sequences and structural regions relative to the actual target sites. Workflows developed in the Sol shell environment will apply shell scripting to implement the preprocessing, docking, and postprocessing stages. Additionally, job arrays will be employed to manage these runs at a large scale.
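The job-array pattern described above can be sketched as follows. This is a minimal Python illustration rather than the project's shell workflow. It assumes the scheduler is Slurm, which sets the `SLURM_ARRAY_TASK_ID` environment variable for each array task; the file names and the `select_task_input` helper are hypothetical.

```python
import os


def select_task_input(inputs, env=os.environ):
    """Return the input assigned to this array task.

    Each array task reads its own index from SLURM_ARRAY_TASK_ID and
    processes only the matching input. Defaults to index 0 when the
    variable is unset (for example, when run outside the scheduler).
    """
    task_id = int(env.get("SLURM_ARRAY_TASK_ID", "0"))
    if not 0 <= task_id < len(inputs):
        raise IndexError(f"task id {task_id} has no matching input")
    return inputs[task_id]


# Hypothetical docking inputs, one per array task.
peptides = ["pep_000.pdb", "pep_001.pdb", "pep_002.pdb"]

# Simulate the environment of array task 2 instead of reading the real one.
chosen = select_task_input(peptides, env={"SLURM_ARRAY_TASK_ID": "2"})
print(chosen)
```

Under these assumptions, submitting the script with `sbatch --array=0-2` would start three independent tasks, each handling one input.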


Facilitator: Juan Jose Garcia Mesa
Project Title: Deep Generative Models for Animal Behavior: Deep Faking Wasps to Understand Individual Recognition
Researcher: Ted Pavlic

Project Description: The research project aims to explore individual recognition in animal behavior, specifically focusing on paper wasps of the genus Polistes. While individual recognition has been extensively studied in vertebrates, it is far less well understood in invertebrates. Paper wasps, known for their complex social systems, offer a unique opportunity for observation in laboratory settings. The proposal involves using generative AI to create deep fakes of wasp interactions, where videos of interactions between wasps are altered to depict the faces of other individuals. This approach allows for manipulating social hierarchies and testing hypotheses about hierarchy formation and maintenance.


Facilitator: Dan Jackson
Project Title: CoMSES Net: the Network for Computational Modeling in the Social and Ecological Sciences
Researcher: Allen Lee

Project Description: CoMSES Net is an NSF-funded science gateway and global research community that serves computational modelers interested in studying complex social and ecological systems. This exploratory project in AI/ML aims to (1) summarize existing computational models, (2) provide concrete guidance on model analysis and documentation via a ChatGPT- or Llama-like interface, and (3) improve computational model discoverability across multiple domains. Working closely with the CoMSES development team and student developers, prototype curation workflows will be implemented to explore new research directions.

2023 Research projects

Facilitator: Rebecca Belshe
Project title: Enabling Multiple Allele Effects
Faculty Researcher: Michael Lynch

Project description: This project pairs CI facilitator Rebecca Belshe with Michael Lynch of the ASU Center for Mechanisms of Evolution. A computational model of phylogenetic lineages incorporating effects such as selection and drift incurs a large memory footprint with multiple mutations. Several multidimensional arrays have been employed to characterize the dynamics, including interference between these mutations, within the population. These arrays are very sparse, and their scale limits the number of mutations that can be included in the model. This project aims to explore and implement a new memory model, benchmark its impact on performance, deliver a new working code, and present a summary report. The project is scheduled to conclude in January 2024.
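To illustrate why sparse storage can matter for arrays like these, here is a minimal Python sketch. It is not the project's memory model: it simply compares the cell count of a dense multidimensional array with a dictionary that stores only nonzero cells. The `SparseArray` class, its methods, and the array shape are hypothetical.

```python
from math import prod


class SparseArray:
    """Hypothetical sparse N-dimensional array storing only nonzero cells."""

    def __init__(self, shape):
        self.shape = tuple(shape)
        self._cells = {}  # maps index tuple -> nonzero value

    def _check(self, index):
        in_bounds = len(index) == len(self.shape) and all(
            0 <= i < n for i, n in zip(index, self.shape)
        )
        if not in_bounds:
            raise IndexError(f"index {index} out of bounds for shape {self.shape}")

    def __getitem__(self, index):
        self._check(index)
        return self._cells.get(index, 0.0)

    def __setitem__(self, index, value):
        self._check(index)
        if value == 0:
            self._cells.pop(index, None)  # never store explicit zeros
        else:
            self._cells[index] = value

    def dense_size(self):
        """Number of cells a dense array of this shape would allocate."""
        return prod(self.shape)

    def stored_size(self):
        """Number of cells actually held in memory."""
        return len(self._cells)


# Hypothetical shape; only two cells are nonzero.
arr = SparseArray((1024, 1024, 16))
arr[3, 7, 0] = 0.25
arr[512, 512, 15] = 0.75
print(arr.dense_size(), arr.stored_size())
```

In this sketch memory scales with the number of nonzero entries rather than with the product of the dimensions, at the cost of slower per-element access than a contiguous array.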


Facilitator: Susan Massey
Project title: TCGA Sex Chromosome Status Pipeline
Faculty Researcher: Melissa Wilson

Project description: This project pairs CI facilitator Susan Massey with Melissa Wilson of the ASU School of Life Sciences. The goal of this project is to develop a workflow for the analysis of sex chromosomes in cancer genomics. Sex chromosomes have long been overlooked in cancer research. As such, there is a need to assess the status of sex chromosomes in the existing sequenced samples of the Cancer Genome Atlas and evaluate their impact on patient outcomes. This project supports the effort to study this through the development of a computational workflow to import TCGA genomic data and determine the presence of a Y chromosome and/or the status of XIST, a marker of X chromosome inactivation, in the samples. The resulting inferred sex chromosome complement will be compared to subjects’ clinically recorded sex, and will then be used in further analyses of outcome (survival analysis) and disease severity (cancer stage). After completion, this reproducible workflow can be applied to additional cancer types. The project is scheduled to conclude in February 2024 with the delivery of a data table, an R script, and a summary report.

CIREN mailing list

Subscribe to our mailing list and stay up to date on the latest news and events at CIREN.