Scientific Research Tricks

Research is awesome! But today, for several reasons, it falls short in some areas, and mistakes creep in along the way.

My objective with this project is to gather some tips and tools to improve how research is done.

Version Control System

At some point in your research, did you make a bad choice or a mistake and have to go back a few steps? Of course you did. Research does not follow a recipe from a cookbook. Mistakes happen. Try to avoid them, but do not regret them. Learn from them.

The problem I want to address here is what you can do when you make a mistake.
You will probably want to take some steps back. If you are a very methodical person, you will have every step written down in a logbook or similar. But I guess that not everybody is like this.

Or maybe you keep track of changes like this:

[Figure: "A Story Told in File Names"]

This is not a good way to track your files. I can see only one possible future for it, and it is a total mess.

So, why not try a Version Control System (VCS)?

Version control "is the management of changes to documents, computer programs, large web sites, and other collections of information" (http://en.wikipedia.org/wiki/Revision_control). With it, you can keep track of every change you make to your research and fall back to any point you want.

One VCS widely used today is git. It was created by Linus Torvalds, who also created the Linux operating system. Fun fact: like Linux, git is named after Linus himself. Git is British English slang for a stupid or unpleasant person, and Linus said "I'm an egotistical bastard, and I name all my projects after myself. First 'Linux', now 'git'." Git is free and open source.

You can try the basics of git here. This book by Scott Chacon and this other book by Richard E. Silverman can also help.

Backup and Sharing

So, do you want to back up your research and/or work on multiple computers? You could try a USB stick or a portable hard drive. That is one way to do it, but maybe not the best. A cloud storage service like Dropbox or Google Drive is probably a better way. Dropbox has a rudimentary VCS built in. Maybe you want to share with your collaborators? You could create a shared folder on Dropbox.

But may I suggest something better? Something stronger: GitHub. It "is a web-based hosting service for software development projects that use the Git revision control system." You can store all your Git projects on GitHub and share them with your collaborators. You can browse your own research and that of others online in your preferred browser. Today you barely need to know git to use it.

Regular expressions

How to find a pattern in text.
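
As a small illustration of finding a pattern in text, here is a sketch using Python's built-in `re` module. The sample notes, the `pattern`, and the `extract_temperatures` helper are all made up for this example; adapt them to your own data.

```python
import re

# Hypothetical lab-notebook text; replace it with your own data.
notes = """run_01 temperature=20.5 K
run_02 temperature=21.0 K
run_03 failed
run_04 temperature=19.75 K"""

# Capture the run label and the decimal number that follows "temperature=".
# re.MULTILINE makes ^ and $ match at the start and end of every line.
pattern = re.compile(r"^(run_\d+) temperature=(\d+\.\d+) K$", re.MULTILINE)


def extract_temperatures(text):
    """Return a dict mapping each matching run label to its temperature."""
    return {run: float(value) for run, value in pattern.findall(text)}


temperatures = extract_temperatures(notes)
for run, value in temperatures.items():
    print(run, value)
```

Lines that do not fit the pattern, such as the failed run, are simply skipped, which is often what you want when pulling numbers out of messy logs.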

Useful links:

Bash

You are doing your research on Linux, right? No!? Well, then you will probably want to skip this section.

If you are running Linux, you will probably have Bash as your Unix shell. Some people are afraid of the terminal. Do not be. The terminal is one of your most powerful allies. If you know it well and master it, what you can do is almost magical.

Note to myself: talk about some commands and the use of pipes to send the result of one command to another.

For example:

  • List the files in a folder and save the list to a file: ls path > list_of_files.txt
  • Find files with a regex: find path | grep 'add-your-regex-here'
  • Remove the files found by the previous command (check the list first; file names containing spaces will break this): find path | grep 'add-your-regex-here' | xargs rm
  • Convert all figures in a folder from one type to another, using ImageMagick's convert: for f in *.jpg; do convert ./"$f" ./"${f%.jpg}.png"; done

Bonus tip: Bash git prompt is a tool that shows basic git information about the repository directly in the prompt.

Script programming

Note to myself: talk about how to use scripting, especially Python.

You can learn Python basics at Codecademy. I have tried it and it is very nice.

There is a tutorial for non-programmers here.

This link and this one contain compilations of free books.

Good Editor

It is very important to have a good editor.

If you are the nerd-geek-kind-of-awesome type, you should probably try Vim or Emacs.

Note to myself: describe Vim, Emacs, Geany.

Unit testing
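
A unit test checks one small piece of code against a known answer, so a later change that breaks it is caught right away. The sketch below uses Python's built-in `unittest` module. The `mean` function is a hypothetical stand-in for a routine from your own analysis, and `run_tests` is just a convenience helper for this example.

```python
import unittest


def mean(values):
    """Hypothetical analysis routine: arithmetic mean of a non-empty sequence."""
    if not values:
        raise ValueError("mean() needs at least one value")
    return sum(values) / len(values)


class TestMean(unittest.TestCase):
    def test_known_answer(self):
        # A case small enough to verify by hand.
        self.assertAlmostEqual(mean([1.0, 2.0, 3.0]), 2.0)

    def test_empty_input_is_rejected(self):
        # Bad input should fail loudly instead of returning nonsense.
        with self.assertRaises(ValueError):
            mean([])


def run_tests():
    """Run the tests above and return True if all of them pass."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMean)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    run_tests()
```

Rerunning the tests after every change to the analysis code is a cheap way to notice when a fix in one place breaks something elsewhere.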

Automation

How to automate things using Python. This is great when you make a mistake and have to run several things again.
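
A minimal sketch of this idea: put every step of the analysis in one Python script, so that after fixing a mistake you rerun a single command instead of repeating each step by hand. The step functions, the cutoff, and the sample measurements below are hypothetical placeholders for your own pipeline.

```python
import statistics


def load_data():
    """Step 1 (placeholder): return raw measurements.

    A real script would read them from a file instead.
    """
    return [10.2, 9.8, 10.5, 55.0, 10.1]


def clean(values, upper_limit=50.0):
    """Step 2: drop readings above a cutoff chosen for this example."""
    return [v for v in values if v <= upper_limit]


def summarize(values):
    """Step 3: reduce the cleaned data to the numbers you report."""
    return {"n": len(values), "mean": statistics.mean(values)}


def run_pipeline():
    """Run every step in order, so one call reproduces the whole result."""
    data = load_data()
    for step in (clean, summarize):
        data = step(data)
    return data


if __name__ == "__main__":
    print(run_pipeline())
```

If the cutoff in `clean` turns out to be wrong, you change one number and run the script again; nothing downstream has to be redone manually.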

Some links to check:

Negative results

Negative results are still results. Publish them. Why publish them? Just look at the following cartoon and I guess you will understand.

[Figure: "Negative Data"]

Got it?

A journal will probably not accept a paper showing negative results. This is where Figshare comes in. This resource "allows users to upload any file format to be made visualisable in the browser so that figures, datasets, media, papers, posters, presentations and filesets can be disseminated in a way that the current scholarly publishing model does not allow." Each image, presentation, or any other kind of data receives a DOI that uniquely identifies your data and allows others to cite it.

Visualization

Visualizing your data is a very important step. I must say, crucial. It is very hard to obtain any results without it, unless you are a statistics ninja. Even then, you would probably use visualization to convince your audience of your findings.

One of the fathers of this field is Edward Tufte. I have not yet read it, but his first book, The Visual Display of Quantitative Information, is very highly recommended. Note to myself: write more about him.

So, how can you visualize your data? There are several tools available. You could use a spreadsheet editor, like LibreOffice Calc. Its features are limited, but you can get a visualization quickly, and it is a good option when you do not have any other skills.

You can draw it by hand, or use GIMP or Inkscape, for example. But for that, it would probably be good to have some artistic skills. Also, if something changes in your data, you would probably have to start from scratch.

You could also code a routine to do the visualization. The advantage of this method is that you will probably only have to code it once, i.e. the code is independent of the data. If the data changes, you only have to run it again. How well this works depends only on your coding skill.

There are several good visualization toolkits in the wild, like D3 for JavaScript and, for Python, Matplotlib and Mayavi. If you want to learn D3, the book by Scott Murray, Interactive Data Visualization for the Web, is now available online for free.
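
To make the idea of a data-independent plotting routine concrete, here is a minimal sketch using Matplotlib. It assumes Matplotlib is installed; the `plot_series` function, its parameters, and the numbers in the usage part are invented for illustration.

```python
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # render to files without needing a display
import matplotlib.pyplot as plt


def plot_series(x, y, output_path, xlabel="x", ylabel="y"):
    """Draw y against x as a line plot and save it to output_path."""
    output_path = Path(output_path)
    fig, ax = plt.subplots()
    ax.plot(x, y, marker="o")
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    fig.savefig(output_path)
    plt.close(fig)  # free the figure so repeated calls do not pile up
    return output_path


if __name__ == "__main__":
    # Made-up numbers; rerun with new data and the figure is regenerated.
    plot_series([0, 1, 2, 3], [0.0, 0.8, 0.9, 0.1], "example.png",
                xlabel="time (s)", ylabel="signal")
```

The function knows nothing about where the numbers come from, so when the data changes you call it again with the new values and get an updated figure.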

Also, Bret Victor is working on software that will let you draw your visualizations dynamically. If you want to know more about it, you can check his talk and here. It is really impressive!

And according to Nathan Yau from FlowingData, there is only one way to learn visualization: work with data.

I also recommend his first book as a starting point for visualization. He addresses what types of visualization there are and the tools available, all in a very pleasant read. I have not yet read his second, but I am very eager to get a copy.

Contributors

gabraganca
