Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
Pages
Page Not Found
Page not found. Your pixels are in another canvas.
About me
Page not in menu
This is a page not in the main menu.
Posts
education
CSI4107: Information Retrieval and the Internet
Basic principles of Information Retrieval. Indexing methods. Query processing. Linguistic aspects of Information Retrieval. Agents and artificial intelligence approaches to Information Retrieval. Relation of Information Retrieval to the World Wide Web. Search engines. Servers and clients. Browser and server side programming for Information Retrieval.
CSI5180: Topics in Artificial Intelligence
Semantic web technologies (RDF, RDFS, OWL). Ontology and knowledge base development. Data integration and normalization. Ontology matching. Semantic Web access through SPARQL queries. Semantic Web expansion from unstructured data (text), including Named Entity Recognition, Entity Linking and Relation Extraction from textual data. Question Answering over Linked Data. Data availability, redundancy, contextualization and trust.
COMP5900: Advanced Machine Learning
Machine learning (ML) is the scientific study of algorithms and statistical models that computers use in order to perform a specific task effectively without using explicit instructions, relying on patterns and inference instead. It is seen as a subset of artificial intelligence. This course will cover advanced topics in machine learning such as deep learning, transfer learning, multiview learning, clustering, and interpretability of ML methods.
presentations
publications
Trouble with the Curve: Predicting Future MLB Players Using Scouting Reports
Using natural language processing to predict future MLB players.
Danovitch, J. (2019). Trouble with the Curve: Predicting Future MLB Players Using Scouting Reports. 2019 Carnegie Mellon Sports Analytics Conference, Pittsburgh, USA. https://arxiv.org/abs/1910.12622
Linking Social Media Posts to News with Siamese Transformers
We design an efficient Siamese architecture to minimize the distance between embeddings of articles and their comments.
Danovitch, J. (2019). Linking Social Media Posts to News with Siamese Transformers. International Conference on Natural Language Computing Advances (NLCA), Vancouver, CA. https://arxiv.org/abs/2001.03303
ComplexDataLab at W-NUT 2020 Task 2: Detecting Informative COVID-19 Tweets by Attending over Linked Documents
We present Gapformer, which effectively classifies content as informative or not. It reformulates the problem as graph classification, drawing on not only the tweet but connected webpages and entities.
Pelrine, Kellin, et al. ComplexDataLab at W-NUT 2020 Task 2: Detecting Informative COVID-19 Tweets by Attending over Linked Documents. Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020). 2020. https://www.aclweb.org/anthology/2020.wnut-1.63/
The Surprising Performance of Simple Baselines for Misinformation Detection
We examine the performance of a broad set of modern transformer-based language models and show that with basic fine-tuning, these models are competitive with and can even significantly outperform recently proposed state-of-the-art methods.
Kellin Pelrine, Jacob Danovitch, and Reihaneh Rabbany. 2021. The Surprising Performance of Simple Baselines for Misinformation Detection. In Proceedings of the Web Conference 2021 (WWW '21). Association for Computing Machinery, New York, NY, USA, 3432–3441. https://doi.org/10.1145/3442381.3450111 https://dl.acm.org/doi/abs/10.1145/3442381.3450111
Fast and Attributed Change Detection on Dynamic Graphs with Density of States
Through extensive experiments using synthetic and real world data, we show that SCPD (a) achieves state-of-the-art performance, (b) is significantly faster than the state-of-the-art methods and can easily process millions of edges in a few CPU minutes, (c) can effectively tackle a large quantity of node attributes, additions or deletions and (d) discovers interesting events in large real world graphs.
Huang, S., Danovitch, J., Rabusseau, G., Rabbany, R. (2023). Fast and Attributed Change Detection on Dynamic Graphs with Density of States. In: Kashima, H., Ide, T., Peng, WC. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2023. Lecture Notes in Computer Science, vol 13935. Springer, Cham. https://doi.org/10.1007/978-3-031-33374-3_2 https://link.springer.com/chapter/10.1007/978-3-031-33374-3_2
Temporal Graph Benchmark for Machine Learning on Temporal Graphs
We present the Temporal Graph Benchmark (TGB), a collection of challenging and diverse benchmark datasets for realistic, reproducible, and robust evaluation of machine learning models on temporal graphs.
Huang, Shenyang, et al. Temporal graph benchmark for machine learning on temporal graphs. Advances in Neural Information Processing Systems 36 (2024). https://proceedings.neurips.cc/paper_files/paper/2023/hash/066b98e63313162f6562b35962671288-Abstract-Datasets_and_Benchmarks.html
talks
Gradient Descent
Slides (note: incomplete)