Rahul Gopinath
rahul@gopinath.org
Lecturer at the University of Sydney, Australia. ശ്രീദേവി's Dad. I work at the intersection of Software Engineering and Cybersecurity. Interested in Program Analysis, Automatic Repair, Mutation Analysis, Specification Mining, Grammar-Based Generators, and Parsing. My website is at https://rahul.gopinath.org

When I first attended a conference (OOPSLA 2002), my parents paid for my economy flights from NZ to the USA; I got a spot as a student volunteer with free registration and stayed at a friend's house an hour's bus ride from Seattle. Having to pay $1K extra just to present is insane!

It costs $0 to publish a paper in POPL/PLDI/ICFP/OOPSLA: if accepted, it appears in the PACMPL journal, and you can choose to present at a conference if you can afford it. It also costs $0 to publish a paper in TOPLAS. These are by far the top PL venues in the world.

Google has just published its Scholar rankings for publication venues here (the link is for Software Systems).

I am somewhat new to the world of complex spreadsheets (I never had to use them before, except as pretty CSV viewers). I am surprised that Excel does not let me rename columns to intuitive names so that I can write `=Total/Max` rather than `=S2/T2`. Is there any spreadsheet that allows this?
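
Out of curiosity, here is the kind of named-column arithmetic I mean, sketched in pandas rather than a spreadsheet (the column names are hypothetical):

```python
# Arithmetic on named columns instead of cell coordinates.
# Column names are hypothetical, for illustration only.
import pandas as pd

df = pd.DataFrame({"Total": [10, 30], "Max": [20, 60]})
df["Ratio"] = df["Total"] / df["Max"]  # reads as =Total/Max, not =S2/T2
print(df)
```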

I have posted about a related issue before, but it seems it is time to take a harder look at using CVEs as the touchstone for the effectiveness of security tools. It has become far too easy to produce CVEs (even high-severity ones) because there is limited oversight in the whole process. If you are a security researcher wondering how to evaluate your tool, please consider using Mutation Analysis as the metric. It is a well-researched technique that can reliably show how your tool performs and give you insight into where you can improve.
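
For those wondering what that looks like in practice, here is a toy sketch (my own illustration, not any particular tool's workflow): seed small syntactic faults (mutants) into the code, run your tool or test suite, and count how many mutants it detects (kills).

```python
# Toy sketch of mutation analysis. The mutants below are hand-written;
# real tools (e.g. mutmut, MutPy) generate them automatically.

def original(a, b):
    return a + b

# Each mutant seeds one small syntactic fault into `original`.
mutants = [
    lambda a, b: a - b,  # '+' mutated to '-'
    lambda a, b: a * b,  # '+' mutated to '*'
    lambda a, b: a,      # right operand deleted
]

def tests_pass(f):
    # A deliberately weak test suite, standing in for the tool under evaluation.
    return f(2, 2) == 4

killed = [m for m in mutants if not tests_pass(m)]
print(f"mutation score: {len(killed)}/{len(mutants)}")  # 2/3
```

The surviving `a * b` mutant is where the insight comes from: `f(2, 2)` cannot distinguish addition from multiplication, so the suite needs a test such as `f(2, 3) == 5`.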

We have reworked the integration with the #mybinder platform, and you can once again interact with notebooks right in your browser (now using #JupyterLab instead of Jupyter Notebook).
As an example, here’s the notebook on fuzzing with grammars: mybinder.org/v2/gh/uds-se/fuzz

You can access these from any chapter in fuzzingbook.org via “Resources” → “Edit as Notebook”. Enjoy!

Just recently read the paper "Delving into ChatGPT usage in academic writing through excess vocabulary" by Kobak et al. Their premise (from the abstract) is that the "[models] can produce inaccurate information, reinforce existing biases, and can easily be misused." So the authors analyse PubMed abstracts for vocabulary changes and identify certain words that have become more common post-LLM. They find that words such as "delves", "showcasing", "underscores", "intricate", "excel", "pivotal", "encompassing", and "enhancing" all show increased usage, and are hence suspect.
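
To convey the flavor of the approach (the paper's actual analysis extrapolates pre-LLM frequency trends and is considerably more careful), here is a toy sketch with made-up corpora:

```python
# Toy sketch of excess-vocabulary detection: compare per-word relative
# frequencies in abstracts from before and after the LLM era, and flag
# words whose usage jumped. The corpora below are made up.
from collections import Counter

pre = ["the results show a key role for x", "we study the role of y"]
post = ["we delve into the pivotal role of x", "results underscore a pivotal role"]

def rel_freqs(docs):
    counts = Counter(w for d in docs for w in d.split())
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()}

f_pre, f_post = rel_freqs(pre), rel_freqs(post)
eps = 1e-6  # smoothing for words never seen pre-LLM
excess = {w: f_post[w] / f_pre.get(w, eps) for w in f_post}
for w, ratio in sorted(excess.items(), key=lambda kv: -kv[1])[:4]:
    print(w, round(ratio, 1))
```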

While this data is indeed interesting, I wonder why LLMs tend to use these words. Aren't LLM outputs supposed to reflect the data they were trained on? Surely that means these words are more common in some training data set than we would expect?

Our paper "Empirical Evaluation of Frequency Based Statistical Models for Estimating Killable Mutants" on evaluation on models for estimating equivalent and killable mutants were accepted by ESEM 2024.  The paper is here. #ESEM2024 #Equivalentmutants #mutationanalysis

I am visiting ANU in Canberra on Friday. If you are around and are interested in what I do, please come talk to me.