The online Data Science Ph.D. from Meharry SACS prepares you to tackle complex big data challenges using AI, machine learning, deep learning, advanced analytics and database design. You’ll gain technical expertise and learn to communicate insights to drive strategic decisions. Through hands-on projects, you’ll apply what you learn to real-world problems while considering ethical data use. As a graduate, you’ll be ready to lead and collaborate on innovative data science solutions. Courses are live and online, with interactive lectures, office hours, and hands-on projects guided by engaged faculty.
Students are eligible for scholarships up to 50% of tuition.
The Data Science Ph.D. program is designed for students with undergraduate or graduate backgrounds in data science, computer science, mathematics, statistics, engineering, information technology, or related areas who wish to contribute to data and computational sciences. The curriculum combines mathematics and statistics, AI and machine learning, advanced data analytics and database design, as well as related topics in research and discovery.
We offer live, online courses and faculty will engage with you through interactive class lectures, office hours, in-class exercises and projects. Ph.D. students will be asked to attend some in-person sessions during the program.
Highlights include:
You will have access to first-class resources including Meharry’s high-performance, supercomputing network, cloud computing environment, and robust data resources.
Completion of the program requires 75 graduate credits. This includes:
Applications for the Data Science Ph.D. program and the Biomedical Data Science Ph.D. program are accepted for the Fall Semester, according to the deadlines below. All admitted candidates will be expected to demonstrate an aptitude for quantitative and computational sciences, which may encompass mathematics, statistics, information systems/technology, fundamental programming skills, and related subjects.
Admission decisions will be based on all aspects of the application, including (1) prior academic performance of the applicant in a baccalaureate or master’s program at a regionally-accredited institution, including coursework and independent research projects, (2) relevant work experience, (3) the applicant’s statement of purpose, (4) letters of support, and (5) test scores.
Fall 2026
March 15, 2026: Priority deadline for application. All materials outlined in Step 2 of the applicant process are due May 1, 2026.
Only completed applications will be considered. Applicants may check the status of their application in their application portal.
Step 1
Step 2
If the institution prefers to mail transcripts, please use this address:
Office of Enrollment Management
School of Applied Computational Sciences
Meharry Medical College
3401 West End Avenue
Suite 260
Nashville, TN 37203
All submitted transcripts become the property of the Meharry Medical College and will not be returned.
The GRE, GMAT, or MCAT test score requirement is waived for applicants for the Fall of 2023 and 2024. We do ask that applicants submit any previous test scores. All scores must be sent directly to Meharry Medical College, via sacsenrollment@mmc.edu, in electronic form.
We will invite select applicants who have completed an application via email for a virtual interview with the faculty admission committee. Interviews will be conducted with cameras on and may be recorded for internal use only. Applicants must complete the interview process to be considered for admission. Final admissions decisions will be made from the pool of interviewed applicants.
Applicants will be notified of their admission decision via the applicant portal. Those offered admission will be asked to communicate their decision.
Applications for admission are only accepted for the Fall Term each year. The requirements for admission are:
All applicants must have the educational equivalent of at least a bachelor’s degree in computer science; information technology/systems; mathematics; statistics; engineering; finance; biomedical, health, or life sciences; or a related discipline from a regionally accredited university in the U.S.
The prior degree(s) must encompass the following minimum coursework requirements, in which applicants must have earned a grade of B or better:
Biomedical Data Science applicants should also have at least one semester of college-level biology or related coursework in the biological/health/life sciences.
Additional coursework and/or experience is recommended, as outlined below:
Minimum Grade Point Average (GPA)
We may also consider other qualifications presented by applicants, such as strong oral, verbal, and interpersonal communication skills, as well as overall goodness-of-fit for the Ph.D. program.
International applicants must hold a degree comparable to a regionally accredited US baccalaureate or master’s degree. Applicants submitting transcripts from international colleges and universities are required to have them verified for US degree program equivalency before being considered for admission. Verification from the following organizations is acceptable:
The decision of the verifying organization must be transmitted directly to Meharry Medical College in electronic form.
Adequate proficiency in spoken and written English is essential to success in graduate study and medical residency training at Meharry Medical College.
Please review Meharry’s F-1 English Language Policy and the College’s English proficiency requirements policy.
Applicants who are enrolled in a Ph.D. program in biomedical data science or a related field outside Meharry Medical College may apply for Meharry’s BDS Ph.D. program as a transfer student. Transfer students are subject to the admissions requirements stated above; therefore, the transfer track is only recommended for first- or second-year graduate students. Applicants who are transferring from another institution and are in good standing at that institution are exempt from the graduate admission exam and TOEFL requirements. A recommendation letter from the previous advisor or program director is required. A maximum of six (6) credit hours can be transferred. For more information, please contact the Office of Enrollment Management at sacsenrollment@mmc.edu.
** Please redact sensitive information prior to sending FERPA protected data via email or contact us for other security options.
JeanLuc Nshimiyimana
sacsadmissions@mmc.edu
Data Science Ph.D. students will enroll in three concurrent courses during the Fall, Spring, and Summer Semesters.
The Pathway to a Data Science Ph.D. degree provides an outline of all degree requirements and a curriculum map, organized by semester, for completing them.
3 credit hours.
Introduction to the basic foundations of computer programming for data science, using Python, R, and SAS as problem solving tools. 1) Introduction to Python. Python syntax to write basic computer programs; Using the interpreter; Built-in and user-defined functions; Introduction to object-oriented programming in Python. 2) Introduction to R. Simple graphing; R Basics: variables, strings, vectors; Data Structures: arrays, matrices, lists, data-frames; Programming Fundamentals: conditions and loops, functions, objects and classes, debugging. 3) Introduction to SAS Programming. The SAS Operating Environment; Understanding Data and the quality characteristics it exhibits; SAS Programming Essentials: SAS Program Structure, SAS Program Syntax; Getting Data In and Out of SAS; Printing and Displaying Data; Introduction to SAS Graphics.
3 credit hours.
The concepts and structures used to store, analyze, manage, and present (visualize) information and navigation using Python, SQL, SAS, and QGIS. Topics will include information analysis and organizational methods, and metadata concepts and applications. Students will be assisted to identify disparate data sources needed to perform analysis for a given real-world problem. Typically, data from a single source will not be adequate to perform the required analysis. Students will pull data from the disparate data sources and import it into SAS, and use several SAS procedures to detect invalid data; format, validate, clean the data; and impute the data if it is missing. This will prepare the data for statistical analysis and decision modeling in SAS.
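As a rough illustration of the cleaning steps described above (detecting invalid values and imputing missing ones), here is a minimal Python sketch. The course itself performs this work in SAS; the records, the 0 to 120 age rule, and the helper names `is_valid_age` and `clean_ages` are all assumptions made for this example.

```python
from statistics import mean

# Hypothetical records merged from two sources; None marks a missing value.
records = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},  # missing
    {"id": 3, "age": -5},    # invalid: negative age
    {"id": 4, "age": 46},
]

def is_valid_age(value):
    """Treat ages outside 0-120 as invalid (an assumed rule for this sketch)."""
    return value is not None and 0 <= value <= 120

def clean_ages(rows):
    """Replace missing or invalid ages with the mean of the valid ones."""
    valid = [r["age"] for r in rows if is_valid_age(r["age"])]
    fill = mean(valid)
    return [
        {**r, "age": r["age"] if is_valid_age(r["age"]) else fill}
        for r in rows
    ]

cleaned = clean_ages(records)
```

Mean imputation is only one of several possible strategies; it is used here because it is the simplest to read.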
3 credit hours.
This course covers other useful mainstream programming languages for data science, beyond Python, R, SQL, and SAS. These “other” potential programming languages supplement the ability to crunch numbers, and equip the data scientist with good all-round programming skills. Programming languages covered will vary depending on industry popularity. While some of the programming languages may not be covered in detail, examples include: Java, Scala, Julia, TensorFlow, Go, Spark.
3 credit hours.
Deep dive into recent advances in AI, focusing on deep learning approaches. Foundations of neural networks. Cutting-edge deep learning models including image, text, multimodal and time-series data. Advanced topics on open challenges of integrating AI in a societal application including interpretability, robustness, privacy and fairness.
3 credit hours.
This course will cover fundamental mathematical background for statistical theories. Probability spaces as models for phenomena with statistical regularity. Discrete spaces (binomial, hypergeometric, Poisson). Continuous spaces (normal, exponential) and densities. Random variables, expectation, independence, conditional probability. The course will cover probabilities, multivariate distribution and special distribution, statistical inference, maximum likelihood methods, sufficiency, test of hypotheses, inference about normal methods, nonparametric statistics, Bayesian statistics.
3 credit hours.
Introduction to machine learning with business applications. Survey of machine learning techniques, including traditional statistical methods, resampling techniques, model selection and regularization, tree-based methods, principal components analysis, cluster analysis, artificial neural networks, and deep learning. Students implement machine learning models with open-source software for data science. They explore data and learn from data, finding underlying patterns useful for data reduction, feature analysis, prediction, and classification.
3 credit hours.
An overview of modern data science: the practice of obtaining, storing, modeling, manipulating, analyzing, and interpreting data. Emerging Big data processing frameworks. NoSQL storage solutions. Memory resident databases and graph databases. Ability to initiate and design highly scalable systems that can accept, store, and analyze large volumes of unstructured data in batch mode and/or real time. Organization, administration and governance of large volumes of both structured and unstructured data.
3 credit hours.
Tools and techniques for building statistical or machine learning models to make predictions based on data. NLP and Text Analytics, Time Series, Experimentation and Optimization.
3 credit hours.
Data visualization tools and technologies essential to analyze massive disparate amounts of information and make data driven decisions. Information and geographic visualization of health data. Hands-on experience in planning, creating and using compelling multimedia visualizations such as online maps, responsive graphs, interactive animations and GIS dashboards. Use of different visualizations to support various research activities including hypothesis formulation, data synthesis, analysis and exploration as well as communicate and share health information. Application of usability and user experience (UX) principles to evaluate the extent to which various visualizations meet expectations.
3 credit hours.
The research process investigating information needs, creation, organization, flow, retrieval, and use. Stages include: research definition, question, objectives, data collection and management, data analysis and data interpretation. Techniques include: observation, interviews, questionnaires, and transaction-log analysis.
3 credit hours.
Introduction to database concepts and the relational database model. Topics include ER Model, Relational Model, Relational Algebra, SQL, normalization, Indexing, Normal Forms, design methodology, DBMS functions, Security, Transaction Management, database administration, and other database management approaches such as client/server databases, object-oriented databases, and data warehouses. Strong emphasis on database system design and application development.
3 credit hours.
Principles, practices, and techniques for effective data modeling in the age of Big data.
3 credit hours.
Utilize current statistical techniques to assess and analyze biomedical and public health related data. Read and critique the use of such techniques in published research. Review of linear models, matrix algebra, and multiple analysis of variance. Introduction to random effects models, understanding and computing power for the GLM, GLM assumption diagnostics, transformations, polynomial regression, coding schemes for regression, multicollinearity. Determine what analytical approaches are appropriate under different research scenarios.
3 credit hours.
Study of Monte Carlo methods, a diverse class of algorithms that rely on repeated random sampling to compute the solution to problems whose solution space is too large to explore systematically or whose systemic behavior is too complex to model. Introduction to important principles of Monte Carlo techniques and their power. Bayesian analysis and Markov chain Monte Carlo samplers, slice sampling, multigrid Monte Carlo, Hamiltonian Monte Carlo, parallel tempering and multi-nested methods, and streaming methods such as particle filters/sequential Monte Carlo. Related topics in stochastic optimization and inference such as genetic algorithms, simulated annealing, probabilistic Gaussian models, and Gaussian processes. Applications to Bayesian inference and machine learning. Python or R for all programming assignments and projects.
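To make the idea of "repeated random sampling" concrete, here is a minimal Python sketch of a classic Monte Carlo estimate: approximating pi from the fraction of random points in the unit square that fall inside the quarter circle. It is a textbook illustration rather than course material, and the function name `estimate_pi` and the fixed seed are assumptions of this example.

```python
import random

def estimate_pi(n_samples, seed=0):
    """Estimate pi by sampling points uniformly in the unit square."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:  # point lies inside the quarter circle
            inside += 1
    # The quarter circle has area pi/4, so scale the hit fraction by 4.
    return 4.0 * inside / n_samples

pi_hat = estimate_pi(100_000)
```

The error of such an estimate shrinks roughly with the square root of the number of samples, which is why more advanced samplers such as those listed above matter for harder problems.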
3 credit hours.
Deep learning is a sub-field of machine learning that focuses on learning complex, hierarchical feature representations from raw data. The dominant method for achieving this, artificial neural networks, has revolutionized the processing of data (e.g. images, videos, text, and audio) as well as decision-making tasks (e.g. game-playing). Its success has enabled a tremendous amount of practical commercial applications and has had a significant impact on society. In this course, students will learn the fundamental principles, underlying mathematics, and implementation details of deep learning. This includes the concepts and methods used to optimize these highly parameterized models (gradient descent and backpropagation, and more generally computation graphs), the modules that make them up (linear, convolution, and pooling layers, activation functions, etc.), and common neural network architectures (convolutional neural networks, recurrent neural networks, etc.). Applications ranging from computer vision to natural language processing and decision-making (reinforcement learning) will be demonstrated. Through in-depth programming assignments, students will learn how to implement these fundamental building blocks as well as how to put them together using a popular deep learning library, PyTorch.
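For a sense of what gradient descent does, here is a minimal Python sketch that fits a single weight `w` in the model y = w * x by repeatedly stepping against the gradient of the mean squared error. It deliberately uses plain Python rather than PyTorch; the data, learning rate, and function name `train` are assumptions chosen for this example.

```python
# Toy data generated from y = 2 * x, so the ideal weight is 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

def train(xs, ys, lr=0.01, steps=500):
    """Fit y = w * x by gradient descent on mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Derivative of mean((w*x - y)^2) with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad  # step against the gradient
    return w

w = train(xs, ys)
```

In a real network the same update is applied to many parameters at once, with backpropagation supplying the gradients.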
3 credit hours.
Examination of case studies. Introduction to healthcare law and ethics, making ethical decisions, contracts, medical records and informed consent, privacy law and HIPAA.
3 credit hours.
Security issues related to the safeguarding of sensitive personal and corporate information against inadvertent disclosure; Policy and societal questions concerning the value of security and privacy regulations, the real world effects of data breaches on individuals and businesses, and the balancing of interests among individuals, government, and enterprises; Current and proposed laws and regulations that govern information security and privacy; Private sector regulatory efforts and self-help measures; Emerging technologies that may affect security and privacy concerns; and Issues related to the development of enterprise data security programs, policies, and procedures that take into account the requirements of all relevant constituencies; e.g., technical, business, and legal.
1 credit hour.
Preparation for the Candidacy Exam intended to demonstrate advanced knowledge of content and materials of the six required classes.
3 credit hours.
A comprehensive review of text analytics and natural language processing with a focus on recent developments in computational linguistics and machine learning. Students work with unstructured and semi-structured text from online sources, document collections, and databases. Using methods of artificial intelligence and machine learning, students learn how to parse text into numeric vectors and to convert higher dimensional vectors into lower dimensional vectors for subsequent analysis and modeling. Applications include speech recognition, semantic processing, text classification, relevant search, recommendation systems, sentiment analysis, and topic modeling. This is a project-based course with extensive programming assignments.
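As an illustration of parsing text into numeric vectors, here is a minimal bag-of-words sketch in Python using only the standard library. The two example documents and the helper names `tokenize` and `to_vector` are assumptions of this example, and the dimension-reduction step mentioned above is not shown.

```python
from collections import Counter

# Two tiny example documents (made up for this sketch).
docs = ["data science is fun", "science of data"]

def tokenize(text):
    """Lowercase and split on whitespace; real tokenizers do much more."""
    return text.lower().split()

# The vocabulary fixes one vector position per distinct word.
vocab = sorted({tok for d in docs for tok in tokenize(d)})

def to_vector(text):
    """Count how often each vocabulary word appears in the text."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocab]

vectors = [to_vector(d) for d in docs]
```

Each document becomes a list of word counts with one position per vocabulary word, which is the kind of numeric input that downstream models expect.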
3 credit hours.
Networks are discrete mathematical objects that describe systems of entities with pairwise relationships. Over the past several decades, technological advances in data collection and extraction have fueled an explosion of data in the form of networks from seemingly all corners of science. This course aims at providing the mathematical foundations of networks with a particular emphasis on their applications in modern data science, using tools from algorithmic graph theory and linear algebra. The topics include basic graph theory, network statistics, search algorithms, community detection, duality theorems and applications. The course will utilize Python (e.g., NetworkX and Jupyter Notebook) to implement and test the techniques in graph theory and network science in synthetic and real data. Students are strongly encouraged to have some familiarity with Python prior to taking this course.
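To give a flavor of the search algorithms and network statistics named above, here is a minimal Python sketch of breadth-first search on a small undirected graph, along with node degrees. It uses only the standard library rather than a graph package, and the example graph and the function name `bfs_distances` are assumptions of this example.

```python
from collections import deque

# A small undirected graph as an adjacency list (made up for this sketch).
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

def bfs_distances(adj, source):
    """Return the number of hops from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:  # first visit gives the shortest hop count
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

distances = bfs_distances(graph, "A")
# Degree (number of neighbors) is one of the simplest network statistics.
degrees = {node: len(neighbors) for node, neighbors in graph.items()}
```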
3 credit hours.
This course introduces fundamentals of signal processing along with its applications in wearable sensor devices. The course includes topics on signal acquisition, techniques on processing the signals captured, including time domain approaches for event detection, time-varying signal processing for understanding the dynamical aspects of complex systems, and finally the application of machine learning algorithms to build predictive models for early insights.
3 credit hours.
What is artificial intelligence (AI)? What does it mean for cybersecurity? And how can AI be integrated to achieve the goals of cybersecurity? This course is designed to answer these questions. A mix of key AI technologies will be introduced to support understanding of the decision-making process where cybersecurity is concerned, and to clarify the role of these technologies in cybersecurity. AI will definitely complement and strengthen cybersecurity practices and improve their application in enhancing our security.
3 credit hours.
This course presents fundamental concepts and techniques in digital image processing and understanding. Both theoretical material and computing techniques are introduced. The analytical tools and methods which are currently used in digital image processing are introduced and applied to practical scenarios. Basic digital computing knowledge and programming skills are reinforced by solving real world problems. Computational studies may be performed in R or Python.
Variable hours per semester may be offered (1–3 hours).
This course provides students an opportunity to delve into a special study of interest related to data science selected by the student under the guidance of a faculty member. The student and faculty member meet weekly to discuss the studies; the student will be required to write a comprehensive review paper on the semester’s studies.
Variable hours per semester may be offered (1–3 hours).
This course provides doctoral students with advanced research skills and strategies for conducting a literature review leading to a dissertation. Through this course, students will produce an extensive and integrative literature review related to their dissertation topic. Students will search, retrieve, summarize, and synthesize relevant studies to produce a comprehensive literature review.
Variable hours per semester may be offered (1–3 hours).
This course provides the student with the opportunity to concisely describe a data science research problem and methodology. Preparation and defense of the dissertation proposal, which clearly articulates the problem to be investigated in the field of data science, the literature review, and what would need to be done to complete the dissertation. The student must successfully defend the proposal before a Dissertation Proposal Committee, which will determine whether the student proceeds to complete the dissertation.
12 credit hours.
Variable hours may be offered.
Completion of the Ph.D. dissertation is the culmination of the doctoral degree in this graduate program. The research topic of the dissertation must be related to the Ph.D. in Data Science program.
Data science is an in-demand career, No. 3 on Glassdoor’s Best Jobs in America list. That demand reflects the great need for professionals who understand how to apply data science to uncover valuable insights from big data. The data science profession, according to Glassdoor, has salaries ranging from $99,000 to $238,000, with compensation increasing with years of experience and expertise.
At Meharry, you will have opportunities to present your research and apply your education to tackling real-world problems. You will also discover new solutions to complex issues. We believe this experience will prepare you for a rewarding career in academia, industry or government.