With the start of my MRes at the University of Portsmouth, one of my main goals is to improve my data handling and data analysis skills. My Python (Python Software Foundation, 2001-2018) skills are rusty and rather limited; I used the language to build and clean a corpus for English for Specific Purposes with the open source tools from Masaryk University NLP Centre & Lexical Computing (n. d.).
However, Python alone is not a standard tool for handling or analysing data. One of the most useful tools, it seems, is R (The R Foundation, 2018), a programming language and software environment for statistical computing. I have played with it before but never gave it sufficient time. Having looked a little closer, I think it may be more intuitive than I had thought.
Because I aim to maintain my code for use in other projects, I should also learn how to use Git (The Git Project, n. d.), software for version control. This should save me some time by allowing me to revert to a previous version of any program I write when I find bugs in a new version.
There are other data analysis tools out there. MATLAB (The MathWorks Inc., 1994-2018) is apparently quite widely used. The problems are that it is expensive and that the code is not easily portable and therefore not easily verified by peers. Olivia Guest (2017) covers this better in her post.
Git Project, The. (n. d.) Git. Retrieved from https://git-scm.com/ August 31st 2018.
Guest, O. (2017) I Hate Matlab: How an IDE, a Language, and a Mentality Harm. Neuroplausible, March 17th 2017. Retrieved from http://neuroplausible.com/matlab August 31st 2018.
Masaryk University NLP Centre & Lexical Computing (n. d.) Corpus Tools. Retrieved from http://corpus.tools/ September 15th 2017.
MathWorks Inc., The. (1994-2018) MATLAB. Retrieved from https://jp.mathworks.com/products/matlab.html August 31st 2018.
Python Software Foundation (2001-2018) Python. Retrieved from https://www.python.org/ August 31st 2018.
R Foundation, The. (2018) The R Project for Statistical Computing. Retrieved from https://www.r-project.org/ August 31st 2018.