
algbench's Introduction

PostDoc Researcher at the Algorithms Group (Prof. Fekete), TU Braunschweig, Germany.


I specialize in solving NP-hard optimization problems in practice, combining techniques such as Mixed Integer Programming, Constraint Programming, SAT solvers, and Meta-Heuristics. My current preferred framework is Adaptive Large Neighborhood Search, which allows me to effectively mix various ideas to obtain high-quality solutions even for complex problems. I describe myself as an algorithm engineer because my strength lies in engineering algorithms with any means necessary. It is difficult to fit me into a single discipline or role, so let me explain the different aspects of my work.

As a software engineer, I write high-quality code and have experience setting up, maintaining, deploying, and managing software projects. Algorithms can become complex and often require frequent iterations, so it is essential to properly modularize them to keep them testable and maintainable. I am comfortable with tools like Git, Docker, and CI pipelines, which help me integrate and deliver solutions efficiently. I am also familiar with low-level details like caching, branch prediction, linking, and mixing languages. While I quickly learn new frameworks, I sometimes need a refresher on topics like design patterns since I am not a full-time developer.

In theoretical computer science, I have a strong foundation, with my Master’s thesis and much of my PhD work focused on this field. I understand concepts like complexity, reductions, and approximation, and can quickly identify problem types like Set Cover or opportunities for dynamic programming. My Master’s thesis addressed a notable open problem (#53 of the Open Problems Project), and I have enjoyed working on topics like Minimum Scan Cover, which involved elegant theoretical proofs. However, I have become less interested in lengthy NP-hardness proofs or abstract concepts that do not have practical applications.

In mathematical optimization, I enjoy translating problems into quantifiable mathematical formulations and am familiar with concepts like cutting planes, integrality gaps, duality, symmetry breaking, and decomposition methods. While I might not be fluent in every technical term and might resort to some hand-waving when explaining more complex methods like the ellipsoid method, I am comfortable bending the rules of mathematical precision to achieve practical results.

As an operations researcher, I build efficient models and apply algorithmic techniques, especially when standard approaches fall short. I value the power of meta-heuristics for tasks like warm-starting or providing quick feedback. Although I am not deeply involved in the business aspects and prefer working directly with solvers rather than using modeling languages, I find that I am particularly valuable when a problem requires more than just running a straightforward solve command.

As a data scientist, I regularly work with data using SQL, NoSQL, Pandas, and other frameworks. Data analysis and visualization are essential for the empirical evaluation of algorithms. I often apply techniques from machine learning, such as PCA and clustering for benchmark building, and have successfully used reinforcement learning for an optimization problem where classical methods failed. I take satisfaction in creating clean and well-documented data pipelines, often using tools like Pydantic. However, these are just tools for me, and while I am interested in AI and eager to explore new techniques, my experience with complex machine learning projects is limited.

In my role as an instructor, I have co-designed and taught courses on algorithm engineering, supervised numerous theses, and developed engaging lab exercises, which are my preferred teaching method. I received a lot of positive feedback for the CP-SAT Primer, which started as a cheat sheet but is now close to becoming a complete book. While I find teaching fulfilling, I recognize that it would not be enough to sustain me if it involved repeatedly covering the same material.

As a researcher, I have co-authored numerous papers and enjoy diving deep into new topics. I have presented at various conferences and collaborated with many peers from different fields. However, I often find the academic system frustratingly slow and inefficient. In contrast, I appreciate the faster pace and smoother processes in consulting work, where progress is more tangible and less encumbered by bureaucracy. Nonetheless, I enjoy the freedom academia provides and the collaborative, non-competitive community it fosters.

Outside of my professional roles, I am passionate about powerlifting and take a strategic approach to my training, aiming to maximize gains with minimal time investment. Given that I rarely have more than three 45-minute sessions per week, I face an intriguing optimization challenge: how to make the most progress while minimizing the risk of injury. This involves carefully balancing training variables, despite the noisy feedback from factors like sleep quality, which can significantly impact performance. The challenge of fine-tuning my regimen to account for these variables mirrors the problem-solving mindset I apply in my work.


🔬 Research 📖 Publications 🎓 Dissertation
Explore my ongoing research projects. Discover my published works. My dissertation offers a coherent sample of my work (2017-2022).

🤝 Let's Connect!

I am open to research stays and collaborative opportunities to broaden my expertise and contribute to groundbreaking research. Let's work together to solve challenging problems and make a lasting impact.

algbench's People

Contributors

chekmanh · d-krupke · racopokemon


Forkers

chekmanh

algbench's Issues

AlgBench will not notice an interrupt if the interrupt was handled by a native extension (like Gurobi)

There is a problem with bad entries when interrupting the benchmark via Ctrl+C. Solvers like Gurobi and CP-SAT handle the signal on their own and stop the search, but they do not interrupt the Python interpreter. Thus, they just terminate early and create bad entries (essentially, the time limit for this entry is shortened).

Check if there is some way to find out whether Ctrl+C was pressed during the creation of an entry.
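One pure-Python approach is to temporarily install a SIGINT handler that only records the signal instead of raising KeyboardInterrupt. This is a hedged sketch, not part of AlgBench (`SigintDetector` is a hypothetical helper), and a native extension that installs its own C-level handler may still bypass it:

```python
import signal


class SigintDetector:
    """Context manager that records whether SIGINT arrived while the
    body ran, even if a native extension swallowed the interrupt.
    Caution: normal KeyboardInterrupt is disabled inside the block."""

    def __enter__(self):
        self.interrupted = False
        # Remember the previous handler so we can restore it on exit.
        self._previous = signal.signal(signal.SIGINT, self._on_sigint)
        return self

    def _on_sigint(self, signum, frame):
        self.interrupted = True  # just record the signal, do not raise

    def __exit__(self, exc_type, exc, tb):
        signal.signal(signal.SIGINT, self._previous)
        return False  # never suppress exceptions
```

An entry created while `interrupted` ended up true could then be discarded or flagged as incomplete instead of being stored as a regular result.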

JSON support for datetime and np.arrays

Suggestion from Michael: extend the JSON compatibility to datetime objects and np.arrays. This might easily be realized by providing a custom encoder/decoder to the JSON converter.
Would this make for a useful addition?
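A minimal sketch of such an encoder, assuming a duck-typed `tolist` check is acceptable so that numpy stays an optional dependency (`ExtendedEncoder` is a hypothetical name, not part of AlgBench):

```python
import json
from datetime import date, datetime


class ExtendedEncoder(json.JSONEncoder):
    """Serialize datetime objects as ISO 8601 strings and numpy arrays
    via their `tolist` method (duck-typed, so importing numpy is not
    required for the encoder itself)."""

    def default(self, o):
        if isinstance(o, (datetime, date)):
            return o.isoformat()
        if hasattr(o, "tolist"):  # e.g. np.ndarray and numpy scalars
            return o.tolist()
        return super().default(o)
```

Usage would look like `json.dumps(entry, cls=ExtendedEncoder)`; decoding back to the original types would additionally need a convention for tagging the values.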

Create a "Changelog" reporting function

It is often interesting to see in which time frame and on which hardware a dataset was created. It should be easy to automatically create a list with
date | hostname | git revision | package versions | number of entries

This should probably also have an export function, such that it can be provided in readable form for archived projects.
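One possible shape for a single row of such a report, as a hedged sketch (the `changelog_record` helper and its field names are assumptions; package versions could additionally be collected via `importlib.metadata`, and the git call is best-effort):

```python
import platform
import subprocess
import sys
from datetime import datetime, timezone


def changelog_record(num_entries: int) -> dict:
    """Collect date, hostname, git revision, Python version, and the
    current entry count into one changelog row. `num_entries` would
    come from the benchmark database."""
    try:
        revision = subprocess.check_output(
            ["git", "rev-parse", "HEAD"],
            text=True,
            stderr=subprocess.DEVNULL,
        ).strip()
    except (OSError, subprocess.CalledProcessError):
        revision = "unknown"  # not a git checkout, or git not installed
    return {
        "date": datetime.now(timezone.utc).isoformat(),
        "hostname": platform.node(),
        "git_revision": revision,
        "python_version": sys.version.split()[0],
        "num_entries": num_entries,
    }
```

Exporting a list of such dicts to CSV or Markdown would cover the archival use case mentioned above.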

Allow deleting data to save space

The data saved by AlgBench is supposed to enable you to debug and investigate observations. However, it can grow to many gigabytes.

  1. You may want to clean up the results but still mark the parameters as already run and processed.
  2. You may want to delete old stdout captures etc., without deleting the whole entry in order to have a more efficient database.

Write an efficient function that can reduce this data without deleting the "already run" property. Some thought is needed on how to deal with the individual elements.
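As an illustration only, assuming entries are plain dicts (the real AlgBench schema may differ), stripping the heavy capture fields while keeping the keys that mark an entry as already run could look like:

```python
def strip_heavy_fields(entry: dict, fields=("stdout", "logging")) -> dict:
    """Return a slim copy of `entry` without the heavy capture fields.
    The field names here are assumptions; keys such as the parameters
    and the result survive, so the entry still counts as 'already run'."""
    return {k: v for k, v in entry.items() if k not in fields}
```

A database-level variant would rewrite each stored entry in place instead of returning a copy, which is where the efficiency considerations come in.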

Create a simple function to check if the data is complete

A serious problem we encountered recently was that a third party, who did not use AlgBench, executed their experiments without any error reporting. When the network or the NFS had short-term problems during an execution spanning multiple weeks, entries went missing without anyone noticing. Only after a painful data analysis did we notice the inconsistencies. Because the underlying framework was not changed (and simply was bad), we again ran into inconsistencies from incomplete or wrongly copied data after two further iterations.

To make sure the data generated with the AlgBench framework is complete, it would be good to have a verify mode that does not execute anything but throws an error if it would have to. This could be a read-only mode that does nothing if the data already exists but throws an exception if it does not.

In this context, it may also be useful to have a finalize() function that sets a read-only flag in the internal JSON, which would trigger this mode.
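The proposed verify mode could look roughly like this; `VerifyingRunner` and its `existing` lookup are hypothetical stand-ins for the real database interface, not existing AlgBench API:

```python
class ReadOnlyError(Exception):
    """Raised when a verify/read-only run would have to execute something."""


class VerifyingRunner:
    """Sketch of the proposed verify mode: instead of executing an
    experiment, raise if its entry is missing from the database."""

    def __init__(self, existing: set):
        # `existing` stands in for whatever key lookup the database offers.
        self.existing = existing

    def run(self, key):
        if key not in self.existing:
            raise ReadOnlyError(f"missing entry: {key!r}")
        # Otherwise do nothing: the data is already there.
```

Iterating the full parameter grid through such a runner would either pass silently (the dataset is complete) or fail loudly at the first missing entry.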

Create an "archive"-function that makes a benchmark read-only

When writing a research paper, you may want to freeze a dataset and prevent accidental changes to it.
It would be easy to add a flag to a dataset that prevents AlgBench from manipulating it. Maybe it should also create a hash for verification.
