At the end of this page, you can find the full list of publications. All papers are also available on Google Scholar.
We find that estrogen-related receptor (ERR) signaling is necessary for the induction of genes involved in mitochondrial and cardiac-specific contractile processes during human induced pluripotent stem cell-derived cardiomyocyte (hiPSC-CM) differentiation.
Sakamoto, T., Batmanov, K., Wan, S., Guo, Y., Lai, L., Vega, R.B. & Kelly, D.P*.
Nature Communications 13, 1991 (2022)
To process large-scale single-cell RNA-sequencing (scRNA-seq) data effectively without excessive distortion during dimension reduction, we present SHARP, an ensemble random projection-based algorithm that is scalable to clustering 10 million cells. Comprehensive benchmarking tests on 17 public scRNA-seq datasets demonstrate that SHARP outperforms existing methods in terms of speed and accuracy.
Wan, S., Kim, J., & Won, K. J*.
Genome Research 30 (2), 205-213 (2020)
We present FUEL-mLoc, an interpretable and efficient web server for Feature-Unified prediction and Explanation of multi-Localization of cellular proteins in multiple organisms.
Wan, S*., Mak, M. W*., & Kung, S. Y.
Bioinformatics 33 (5), 749-750 (2017)
This paper proposes LNP-Chlo, a multi-label predictor based on ensemble linear neighborhood propagation (LNP), which leverages hybrid sequence-based feature information from both labeled and unlabeled proteins to predict the localization of both single- and multi-label chloroplast proteins. Experimental results on a stringent benchmark dataset and a novel independent dataset suggest that LNP-Chlo performs at least 6% (absolute) better than state-of-the-art predictors. This paper also demonstrates that ensemble LNP significantly outperforms LNP based on individual features.
Wan, S*., Mak, M. W*., & Kung, S. Y.
Journal of Proteome Research 15 (12), 4755-4762 (2016)
This paper proposes an efficient predictor, namely Mem-mEN, which can produce sparse and interpretable solutions for predicting membrane proteins with single- and multi-label functional types.
Wan, S*., Mak, M. W*., & Kung, S. Y.
IEEE/ACM Transactions on Computational Biology and Bioinformatics 13(4), 706–718 (2016)
We propose an efficient multi-label predictor, mGOASVM, for predicting the subcellular localization of multi-location proteins. mGOASVM achieves actual accuracies of 88.9% and 87.4% on a virus and a plant benchmark dataset, respectively, which are significantly higher than those achieved by state-of-the-art predictors such as iLoc-Virus and iLoc-Plant.
Wan, S., Mak, M. W*., & Kung, S. Y.
BMC Bioinformatics 13, 1-16 (2012)
Machine learning for protein subcellular localization prediction
Wan, S., & Mak, MW.
De Gruyter, ISBN 978-1-5015-0150-0, 2015, Germany
Bioinformatics and machine learning for cancer biology
Wan, S., Fan, Y., Jiang, C., & Li, S.
MDPI, ISBN 978-3-0365-4814-2, 2022, Switzerland. (edited book)
Functional Connectivity Alterations in Cocaine Use Disorder: Insights from the Triple Network Model and the Addictions Neuroclinical Assessment Framework
Xu, Z., Liu, R., Azzam, M., Wan, S., & Wang, J*.
bioRxiv 2024.11.12.623073 (2024)
SAMP: Identifying antimicrobial peptides by an ensemble learning model based on proportionalized split amino acid composition
Feng, J., Sun, M., Liu, C., Zhang, W., Xu, C., Wang, J., Wang, G., & Wan, S*.
Briefings in Functional Genomics (accepted) (2024)
A review of artificial intelligence-based brain age estimation and its applications for related diseases
Azzam, M., Xu, Z., Liu, R., Li, L., Soh, K.M., Challagundla, K.B., Wan, S., Wang, J*.
Briefings in Functional Genomics, elae042 (2024)
WIMOAD: Weighted integration of multi-omics data for Alzheimer’s Disease (AD) diagnosis
Xiao, H., Wang, J., Wan, S*.
bioRxiv 2024.09.25.614862 (2024)
RanBALL: An Ensemble Random Projection Model for Identifying Subtypes of B-cell Acute Lymphoblastic Leukemia
Li, L., Xiao, H., Wu, X., Tang, Z., Khoury, J., Wang, J., & Wan, S*.
bioRxiv 2024.09.24.614777 (2024)
A prognostic framework for predicting lung signet ring cell carcinoma via a machine learning based cox proportional hazard model
Chen, H., Xu, Y., Lin, H., Wan, S*., Luo, L*.
Journal of Cancer Research and Clinical Oncology 150(364), 1-15 (2024)
Multi-Omics based artificial intelligence for cancer research
Li, L.#, Sun, M.#, Wang, J., Wan, S*.
Advances in Cancer Research 163, 303-356 (2024)
The context-dependent epigenetic and organogenesis programs determine 3D vs. 2D cellular fitness of MYC-driven cancer
Fang, J.#, Singh, S.#, Wells, B., Wu, Q., Jin, H., Janke, L., Wan, S., Steele, J., Connelly, J., Murphy, A., Wang, R., Davidoff, A., Ashcroft, M., Pruett-Miller, S., Yang, J.
Research Square 10.21203/rs.3.rs-4390765/v1 (2024)
Artificial intelligence for omics data analysis
Ahmed, Z.#, Wan, S.#, Zhang, F.#, & Zhong, W.#
BMC Methods 1, 4 (2024)
A review for artificial intelligence based protein subcellular localization
Xiao, H., Zou, Y., Wang, J., & Wan, S*.
Biomolecules 14, 409 (2024)
Procyanidin alleviates ferroptosis and inflammation of LPS-induced RAW264.7 cell via the Nrf2/HO-1 pathway
Zeng, J., Weng, Y., Lai, T., Chen, L., Li, Y., Huang, Q., Zhong, S., Wan, S*., & Luo, L*.
Naunyn-Schmiedeberg’s Archives of Pharmacology (2023)
Editorial: Bioinformatics analysis of omics data for biomarker identification in clinical research, Volume II
Sun, M.#, Li, L.#, Xiao, H.#, Feng, J.#, Wang, J., & Wan, S*.
Frontiers in Genetics 14, 1256468 (2023)
USP1 expression driven by EWS::FLI1 transcription factor stabilizes Survivin and mitigates replication stress in Ewing sarcoma
Mallard, H. J., Wan, S., Nidhi, P., Hanscom-Trofy, Y. D., Mohapatra, B., Woods, N. T., Lopez-Guerrero, J. A., Llombart-Bosch, A., Machado, I., Scotlandi, K., Kreiling, N. F., Perry, M. C., Mirza, S., Coulter, D. W., Band, V., Band, H., & Ghosal, G.
Molecular Cancer Research, MCR-23-0323 (2023)
Embedded bioprinting of breast tumor cells and organoids using low concentration collagen based bioinks
Shi, W., Mirza, S., Kuss, M., Liu, B., Hartin, A., Wan, S., Kong, Y., Mohapatra, B., Krishnan, M., Band, H., Band, V., & Duan, B.
Advanced Healthcare Materials e2300905 (2023)
Editorial: Ferroptosis as a novel therapeutic target for inflammation-related diseases
Liang, Y., Su, Z., Mao, X., Wan, S*., & Luo, L*.
Frontiers in Pharmacology 14, 1152326 (2023)
Editorial: Single cell meets metabolism and cancer biology
Wang, J., & Wan, S*.
Frontiers in Oncology 13, 1125186 (2023)
Etiology of oncogenic fusions in 5,190 childhood cancers and its clinical and therapeutic implication
Liu, Y., Klein, J., Bajpai, R., Dong, L., Tran, Q., Kolekar, P., Smith, J. L., Ries, R. E., Huang, B. J., Wang, Y. C., Alonzo, T. A., Tian, L., Mulder, H. L., Shaw, T. I., Ma, J., Walsh, M. P., Song, G., Westover, T., Autry, R. J., Gout, A. M., Wheeler, D.A., Wan, S., Wu, G., Yang, J.J., Evans, W.E., Loh, M., Easton, J., Zhang, JH., Klco, J.M., & Ma, X.
Nature Communications 14 (1), 1739 (2023)
The nuclear receptor ERR cooperates with the cardiogenic factor GATA4 to orchestrate transcriptional control of cardiomyocyte differentiation
Sakamoto, T., Batmanov, K., Wan, S., Guo, Y., Lai, L., Vega, R.B. & Kelly, D.P.
Nature Communications 13, 1991 (2022)
Alzheimer’s disease-associated U1 snRNP splicing dysfunction causes neuronal hyperexcitability and cognitive impairment
Chen, P. C., Han, X., Shaw, T. I., Fu, Y., Sun, H., Niu, M., Wang, Z., Jiao, Y., Teubner, B. J. W., Eddins, D., Beloate, L. N., Bai, B., Mertz, J., Li, Y., Cho, J. H., Wang, X., Wu, Z., Liu, D., Poudel, S., Yuan, Z. F., Mancieri, A., Low, J., Lee, H.M., Patton, M., Earls, L., Stewart, E., Vogel, P., Wan, S., Serrano, G., Beach, T., Dyer, M., Smeyne, R., Moldoveanu, T., Chen, T., Wu, G., Zakharenko, S., Yu, G., & Peng, J.
Nature Aging 2(10), 923–940 (2022)
A sequence obfuscation method for protecting personal genomic privacy
Wan, S*., & Wang, J*.
Frontiers in Genetics 13, 876686 (2022)
Genomic profiling identifies genes and pathways dysregulated by HEY1–NCOA2 fusion and shines a light on mesenchymal chondrosarcoma tumorigenesis
Qi, W., Rosikiewicz, W., Yin, Z., Xu, B., Jiang, H., Wan, S., Fan, Y., Wu, G., & Wang, L.
The Journal of Pathology 257 (5), 579-592 (2022)
Special issue on bioinformatics and machine learning for cancer biology
Wan, S*., Jiang, C., Li, S., & Fan, Y.
Biology 11 (3), 361 (2022)
Identification of a modular super-enhancer in murine retinal development
Honnell, V., Norrie, J. L., Patel, A. G., Ramirez, C., Zhang, J., Lai, Y. H., Wan, S., & Dyer, M. A.
Nature Communications 13 (1), 253 (2022)
Improving bulk RNA-seq classification by transferring gene signature from single cells in acute myeloid leukemia
Wang, R., Zheng, X., Wang, J., Wan, S., Song, F., Wong, M. H., Leung, K. S., & Cheng, L.
Briefings in Bioinformatics 23 (2), bbac002 (2022)
Editorial: Transcriptional regulation in metabolism and immunology
Jiang, C., Wan, S., Hu, P., Li, Y., & Li, S.
Frontiers in Genetics 13, 845697 (2022)
Targeting the spliceosome through RBM39 degradation results in exceptional responses in high-risk neuroblastoma models
Singh, S., Quarni, W., Goralski, M., Wan, S., Jin, H., Van de Velde, L. A., Fang, J., Wu, Q., Abu-Zaid, A., Wang, T., Singh, R., Craft, D., Fan, Y., Confer, T., Johnson, M., Akers, W. J., Wang, R., Murray, P. J., Thomas, P. G., Nijhawan, D., Davidoff, A.M., & Yang, J.
Science Advances 7 (47), eabj5405 (2021)
YAP/TAZ maintain the proliferative capacity and structural organization of radial glial cells during brain development
Lavado, A., Gangwar, R., Paré, J., Wan, S., Fan, Y., & Cao, X.
Developmental Biology 480, 39-49 (2021)
SHARP: hyperfast and accurate processing of single-cell RNA-seq data via ensemble random projection
Wan, S., Kim, J., & Won, K.J.
Genome Research 30 (2), 205-213 (2020)
A critical role for estrogen-related receptor signaling in cardiac maturation
Sakamoto, T., Matsuura, T. R., Wan, S., Ryba, D. M., Kim, J. U., Won, K. J., Lai, L., Petucci, C., Petrenko, N., Musunuru, K., Vega, R. B., & Kelly, D. P.
Circulation Research 126 (12), 1685-1702 (2020)
MondoA drives muscle lipid accumulation and insulin resistance
Ahn, B., Wan, S., Jaiswal, N., Vega, R. B., Ayer, D. E., Titchenell, P. M., Han, X., Won, K. J., & Kelly, D. P.
JCI Insight 4 (15) (2019)
Predicting subcellular localization of multi-location proteins by improving support vector machines with adaptive-decision schemes
Wan, S*., & Mak, M.W.*
International Journal of Machine Learning and Cybernetics 9, 399-411 (2018)
Is congenital amusia a disconnection syndrome? A study combining tract- and network-based analysis
Wang, J., Zhang, C., Wan, S., & Peng, G.
Frontiers in Human Neuroscience 11, 473 (2017)
Gram-LocEN: Interpretable prediction of subcellular multi-localization of Gram-positive and Gram-negative bacterial proteins
Wan, S*., Mak, M.W.*, & Kung, S.Y.
Chemometrics and Intelligent Laboratory Systems 162, 1-9 (2017)
FUEL-mLoc: feature-unified prediction and explanation of multi-localization of cellular proteins in multiple organisms
Wan, S*., Mak, M.W.*, & Kung, S.Y.
Bioinformatics 33 (5), 749-750 (2017)
Transductive learning for multi-label protein subchloroplast localization prediction
Wan, S*., Mak, M.W., & Kung, S.Y.
IEEE/ACM Transactions on Computational Biology and Bioinformatics 14(1), 212–224 (2017)
Ensemble linear neighborhood propagation for predicting subchloroplast localization of multi-location proteins
Wan, S*., Mak, M.W.*, & Kung, S.Y.
Journal of Proteome Research 15 (12), 4755-4762 (2016)
Benchmark data for identifying multi-functional types of membrane proteins
Wan, S*., Mak, M.W.*, & Kung, S.Y.
Data in Brief 8, 105-107 (2016)
Mem-ADSVM: A two-layer multi-label predictor for identifying multi-functional types of membrane proteins
Wan, S*., Mak, M.W.*, & Kung, S.Y.
Journal of Theoretical Biology 398, 32-42 (2016)
Sparse regressions for predicting and interpreting subcellular localization of multi-label proteins
Wan, S*., Mak, M.W.*, & Kung, S.Y.
BMC Bioinformatics 17 (1), 1-17 (2016)
Mem-mEN: predicting multi-functional types of membrane proteins by interpretable elastic nets
Wan, S*., Mak, M.W., & Kung, S.Y.
IEEE/ACM Transactions on Computational Biology and Bioinformatics 13(4), 706–718 (2016)
mLASSO-Hum: a LASSO-based interpretable human-protein subcellular localization predictor
Wan, S*., Mak, M.W., & Kung, S.Y.
Journal of Theoretical Biology 382, 223-234 (2015)
mPLR-Loc: An adaptive-decision multi-label classifier based on penalized logistic regression for protein subcellular localization prediction
Wan, S., Mak, M.W., & Kung, S.Y.
Analytical Biochemistry 473, 14-27 (2015)
R3P-Loc: A compact multi-label predictor using ridge regression and random projection for protein subcellular localization
Wan, S., Mak, M.W., & Kung, S.Y.
Journal of Theoretical Biology 360, 34-45 (2014)
HybridGO-Loc: mining hybrid features on gene ontology for predicting subcellular localization of multi-location proteins
Wan, S., Mak, M.W., & Kung, S.Y.
PLoS ONE 9 (3), e89545 (2014)
Semantic similarity over gene ontology for multi-label protein subcellular localization
Wan, S*., Mak, M.W., & Kung, S.Y.
Engineering 5 (10), 68 (2013)
GOASVM: A subcellular location predictor by incorporating term-frequency gene ontology into the general form of Chou’s pseudo-amino acid composition
Wan, S*., Mak, M.W., & Kung, S.Y.
Journal of Theoretical Biology 323, 40-48 (2013)
mGOASVM: Multi-label protein subcellular localization based on gene ontology and support vector machines
Wan, S., Mak, M.W. & Kung, S.Y.
BMC Bioinformatics 13, 1-16 (2012)
Processing millions of single cells by SHARP
Wan, S., Kim, J. & Won, KJ.
The 11th ACM Conference on Bioinformatics, Computational Biology and Health Informatics (ACM BCB 2020), virtual online, Sep (2020)
Hyper-fast and accurate clustering of ultra-large-scale single-cell data with ensemble random projection
Wan, S., Kim, J., Fan, Y., & Won, KJ.
The 2020 International Conference on Machine Learning (ICML) Workshop on Computational Biology, virtual online, Jul (2020)
Protecting genomic privacy by a sequence-similarity based obfuscation method
Wan, S., Mak, M.W., & Kung, S.Y.
arXiv preprint arXiv:1708.02629 (2017)
Ratio utility and cost analysis for privacy preserving subspace projection
Wan, S., & Kung, S.Y.
arXiv preprint arXiv:1702.07976 (2017)
Ensemble random projection for multi-label classification with application to protein subcellular localization
Wan, S., Mak, MW., Zhang, B., Wang, Y., & Kung, SY.
2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP’14), Florence, Italy, May 2014, pp. 5999-6003 (2014)
An ensemble classifier with random projection for predicting multi-label protein subcellular localization
Wan, S., Mak, MW., Zhang, B., Wang, Y., & Kung, SY.
The 2013 IEEE International Conference on Bioinformatics and Biomedicine (BIBM’2013), Shanghai, China, Dec. 2013, pp. 35-42 (2013)
Adaptive thresholding for multi-label SVM classification with application to protein subcellular localization prediction
Wan, S., Mak, MW., & Kung, SY.
2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP’13), Vancouver, Canada, May 2013, pp. 3547-3551 (2013)
GOASVM: Protein subcellular localization prediction based on Gene ontology annotation and SVM
Wan, S., Mak, MW., & Kung, SY.
2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP’12), Kyoto, Japan, Mar. 2012, pp. 2229-2232 (2012)
Protein subcellular localization prediction based on profile alignment and Gene Ontology
Wan, S., Mak, MW. & Kung, SY.
2011 IEEE International Workshop on Machine Learning for Signal Processing (MLSP’11), Beijing, China, Sep. 2011, pp. 1-6 (2011)
A method of continuous data flow embedded within speech signals
Wan, S., Yao, C., Hu, Y., Zhang, G.
The 2nd International Conference on Signal Acquisition and Processing (ICSAP’10), Bangalore, India, Feb. 2010, pp. 362-365 (2010)
RanBALL: Identifying B-cell acute lymphoblastic leukemia subtypes based on an ensemble random projection model, Cancer Research, vol. 84 (6_Supplement), p. 4907.
Li, L., Xiao, H., Khoury, J. D., Wang, J., & Wan, S*.
AACR Annual Meeting 2024, San Diego, CA, Apr. 2024
Reducing health disparities for prostate adenocarcinoma by integrating multi-omics data via a multi-modal transfer learning approach, Cancer Research, vol. 84 (6_Supplement), p. 4800.
Li, L., Wang, J., & Wan, S*.
AACR Annual Meeting 2024, San Diego, CA, Apr. 2024
SAMP: An accurate ensemble model based on proportionalized split amino acid composition for identifying antimicrobial peptides
Feng, J., Sun, M., Zhang, W., Wang, G., & Wan, S*.
Antimicrobial Peptides, Yesterday, Today and Tomorrow 2023, Omaha, NE, Oct (2023)
B-cell acute lymphoblastic leukemia subtype identification with an ensemble random projection-based machine learning model
Li, L., Xiao, H., & Wan, S.
CHRI Scientific Conference 2023, Omaha, NE, Nov (2023)
Integrating multi-omics data by a multi-modal transfer learning model to reduce healthcare disparities for kidney renal clear cell carcinoma
Li, L., & Wan, S.
CHRI Scientific Conference 2023, Omaha, NE, Nov (2023)
RanBALL: An ensemble random projection-based model for identifying B-cell acute lymphoblastic leukemia subtypes
Li, L., Xiao, H., & Wan, S.
PCRG symposium 2023, Omaha, NE, Aug (2023)
RNA-seq and ChIP-seq profiling identifies genes and pathways dysregulated by HEY1-NCOA2 fusion and sheds light on mesenchymal chondrosarcoma tumorigenesis
Qi, W., Rosikiewicz, W., Yin, Z., Xu, B., Wan, S., Fan, Y., Wu, G., & Wang, L.
AACR Annual Meeting 2021, Philadelphia, PA, Apr (2021)
The estrogen-related receptor (ERR) drives cardiac myocyte maturation in cooperation with GATA4
Sakamoto, T., Wan, S., Batmanov, K., & Kelly, DP.
Circulation Research 127, A222 (2020)
Hyper-fast and accurate clustering of ultra-large-scale single-cell data with ensemble random projection
Wan, S., Kim, J., Fan, Y., & Won, KJ.
Cell Symposia, The Conceptual Power of Single-Cell Biology, San Francisco, CA, USA, Apr (2020). (postponed due to COVID-19 outbreak)
The estrogen-related receptor coordinates transcription of genes involved in mitochondrial and contractile maturation in human induced pluripotent stem cell-derived cardiac myocytes
Sakamoto, T., Wan, S., Won, K.J., & Kelly, D.P.
Circulation, vol. 140 (Suppl_1), p. A11803 (2019). (presented at the American Heart Association Scientific Sessions (AHA2019), Philadelphia, PA, USA, Nov 2019)
Estrogen-related receptor signaling is critical for postnatal cardiac maturation
Matsuura, T.R., Sakamoto, T., Ryba, D.M., Wan, S., & Kelly, D.P.
Circulation, vol. 140 (Suppl_1), p. A11803 (2019). (presented at the American Heart Association Scientific Sessions (AHA2019), Philadelphia, PA, USA)
MondoA mediates myocyte lipid accumulation and insulin resistance driven by chronic nutrient excess
Ahn, B., Wan, S., Won, K.J., Jaiswal, N., Titchenell, P.M., & Kelly, D.P.
American Diabetes Association’s 79th Scientific Sessions (ADA2019), San Francisco, CA, USA, Jun (2019) (oral)
Hyper-fast and accurate processing of large-scale single-cell transcriptomics data via ensemble random projection
Wan, S., Kim, J., & Won, K.J.
RECOMB/ISCB Conference on Regulatory & Systems Genomics with DREAM Challenges (RSG DREAM 2018), New York, USA, Dec (2018)