CS 678 - Advanced Machine Learning and Neural Networks
Assignments – Winter 2018
Note: All written assignments (except homework) are to be done with a word processor and be neat and professional. You will hand in a hard copy at the beginning of class on the due date. Good writing, grammar, punctuation, organization, etc. are important and will affect your grade.
Your projects will be done in groups of two. In rare situations with unique justifications, you may approach me about a group of 1 or 3. Work together closely on the projects and teach each other as you go. You will hand in one written document and share the oral presentations. Email me a thoughtful and honest evaluation of the contributions of your group members (including yourself). For each, include a score from 0 to 10 indicating your evaluation of their work (10 meaning they were a valuable member of the team that made significant contributions to the project and were good to work with, 0 meaning they contributed nothing). If you would like, you may also include any clarifying comments.
Start searching on your own and then find a partner by the first of next week to create your combined proposal. Your team will give me a ranked list of three possible models from which you will be doing an in-depth implementation and presentation (see below). This assignment will also give you an opportunity to explore the field of machine learning a little more. Do a write-up (together!) of one single-spaced page each on three papers of your choosing in the area of machine learning, which you will hand in at the beginning of class on the due date. At least two of the papers should be ones you have not read before the semester (and in areas we will not directly cover this semester), and the papers should each cover a different area in machine learning. These papers should focus on machine learning models that are new to you and interesting to you. The papers should focus on innovative technical aspects of machine learning and not just applications of machine learning. For each model, you should a) briefly describe the main approach and contribution of the paper and b) give your opinion on the specific strengths, weaknesses, and contributions of the approach. Grading will be based on the perceived effort and insights you have gained. Note that one of the papers need not be new to you. That allows you to potentially implement a model that you have been considering for your research, etc. But, if so, it MUST be a model that you have not already started developing or using in your research.
In your hard copy, give me a priority order and full reference for the three papers you would like to present. Also e-mail me PDF files (or working links to PDFs) for each paper with the preferred ranking stated. I will let you know which of the three models you will implement and present. I will try to follow your ranking, but may choose another if more appropriate. In judging what is appropriate, I will consider:
• Is this a doable and reasonable model for a one-semester project?
• I prefer that no model is presented by more than one group, since the presentations are an important part of our learning experience as a class. It will be first come, first served if multiple groups want to do the same model, so you can hand this in as early as you like.
Some good sources for potential papers include Proceedings of the International Conference on Machine Learning, Proceedings of Neural Information Processing Systems, Machine Learning Journal, Journal of Artificial Intelligence Research, Journal of Machine Learning Research, Neural Networks, IEEE Transactions on Neural Networks, Neural Computation, among others.
Model of your choice Project and Presentation
Hard copy due at the beginning of class on the day you give your oral presentation of the model (March 8th or 13th). This project must be started early in the semester. Your write-up should be formatted like a conference paper (5 page limit). You will usually have a small bibliography with more than the initial paper from the readings assignment above.
1) Implement an ML model of your choice. This will be a model not previously implemented for another class. You may choose one of the models discussed in the class, or one that you have an interest in. The model will come from the ranked list of 3 possible models you give me in the initial assignment above. You can give it to me sooner and those who get it in soonest will have highest priority if there are multiple people wanting to do the same model. I will then give you the OK to proceed forward. This will allow me to give you feedback on your choice, and allow more diversity on the models we consider, so we can all learn from the presentations.
2) Test the model on at least 2 different data sets and for a reasonable spread of the learning algorithm parameters. Analyze and discuss your results.
3) Do a creative experiment with the model and discuss your findings.
4) In addition to your written conference style report, you will prepare a conference style talk (slide presentation) to be given to the class (~10-15 minutes). The talk should motivate your model, explain how it works, discuss results on your tasks, and give your overall analysis including recommendations for potential improvements of the model. If you are doing an algorithm that we have not discussed in class, you will need to take more time to explain how the algorithm works. In both the talk and the paper, make sure you take some time to give your overall analysis of the algorithm you use, including your perceived strengths and weaknesses of the algorithm, and include useful hints, insights, and warnings for those who might use it in the future.
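As an illustration of item 2, a "reasonable spread of the learning algorithm parameters" over several data sets can be organized as a simple grid sweep. The sketch below is only one possible layout, not a required format: the names `sweep`, `best_per_dataset`, and `evaluate` are hypothetical, and `evaluate` stands in for whatever train-and-score routine your own model provides (it is assumed to return a score where higher is better).

```python
from itertools import product


def sweep(datasets, grid, evaluate):
    """Call evaluate(data, **params) for every data set and parameter combination.

    datasets: dict mapping a data set name to its data (any object).
    grid:     dict mapping a parameter name to the list of values to try.
    evaluate: hypothetical user-supplied function returning a score.
    """
    names = sorted(grid)
    results = []
    for dataset_name, data in datasets.items():
        # product(...) enumerates every combination of the listed values.
        for values in product(*(grid[name] for name in names)):
            params = dict(zip(names, values))
            results.append((dataset_name, params, evaluate(data, **params)))
    return results


def best_per_dataset(results):
    """Keep the highest-scoring (params, score) pair for each data set."""
    best = {}
    for dataset_name, params, score in results:
        if dataset_name not in best or score > best[dataset_name][1]:
            best[dataset_name] = (params, score)
    return best
```

Keeping the full `results` list, rather than only the best setting, makes it easier to report how sensitive the model is to each parameter when you analyze and discuss your results.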
You will hand in a project outline and progress report on Feb 6 which includes the following:
1. What other papers you plan to include in the bibliography
2. Brief description of the implementation you will do and what you have accomplished so far
3. What data sets will you test it on
4. What creative experiment(s) are you considering
5. Proposed timeline for completion
This proposal should give your best guess on these issues. If you find better data sets, experiments, etc. as you work on your project (which is common) you may switch to those.
Deep Learning Project
Hard copy due at the beginning of class on the day you present (~April 12).
1) Your team will use deep learning (or recurrent nets) to solve a task. You may choose any deep learning model and you may choose your task. If you are doing deep learning for the Model of your choice project, then you need to do a very different deep learning model for this project. You do not need to have the same team. On March 8th your team will hand in a proposal of the model you wish to implement, with the proposed task(s) (data sets) to solve, along with a description, comments, and a proposed timeline for completion.
2) First, solve your task on a simpler shallow model (e.g. MLP with one hidden layer) and report your results as a baseline to compare with.
3) You may choose from any of the approaches we discuss in class or any other deep learning (or recurrent net) approaches you would like. This includes convolutional net approaches, deep belief nets, stacked auto-encoders, LSTMs, BPTT, latest supervised approaches, etc. You CANNOT just use standard BP with multiple hidden layers. Use the produced features as input to your shallow model as at least one comparison with the baseline. Try different hyperparameters on your deep network to try to get the best accuracy possible. Give and discuss your results. If using an unsupervised approach, try to refine the weights of the entire network with backpropagation after your initial learning and discuss improvements. If using stacked autoencoders, make sure you include mechanisms to encourage sparsity (e.g. denoising, weight decay, etc.).
4) Implementation – Models are best understood when you implement them from scratch, understanding the specifics of the algorithm, rather than using a black box tool already prepared. In this class we like to get “under the hood.” For example, TensorFlow is a great tool for doing deep networks, but can leave you not understanding the internals of the algorithms if you depend too much on it. On the other hand, it is nice for implementing the many layers, getting fast code, visualizing results, etc. I want you to implement the basic learning modules of your algorithm. You may use a tool, like TensorFlow, to aid in testing, visualizing, trying different architectures, etc., as long as you implement the basic functionalities of the model yourself, in order to have an in-depth understanding. Thus, you must at least build the basic "layer" of your deep model yourself; then you can use a tool to put those layers together and test. Also, if you have implemented it correctly yourself, and then want to start using the libraries of a tool like TensorFlow rather than your own built layer, you may, especially if you are doing a basic deep model for which the tool already has implementations. However, you need to demonstrate and report, at least on a simple problem, that you get the same results with your version as with the tool's version.
5) You may try it on more than one task if you would like. That would be better.
6) Analyze and discuss your results in a written paper (4 page limit).
7) We will set aside one or two days to do presentations of your approaches and results so that we can all learn from each other. If you are doing an algorithm which we have not discussed in class, you will need to take more time to explain how the algorithm works. Make sure you take some time to give your overall analysis of the algorithm you use, including perceived strengths and weaknesses of the algorithm, and include useful hints, insights, and warnings for those who might use it in the future.
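As an illustration of the "build the basic layer yourself" requirement in item 4, the sketch below shows a hand-written fully connected sigmoid layer and one way to sanity-check it on a simple problem. It is a minimal example under stated assumptions, not the required design: the names `DenseLayer`, `squared_error`, and `max_gradient_error` are hypothetical, and no tool such as TensorFlow is used here. The reference is a central finite-difference gradient instead; the same compare-against-a-reference pattern would apply when checking your layer against a tool's version.

```python
import math
import random


class DenseLayer:
    """Fully connected layer with a sigmoid activation, written without a framework."""

    def __init__(self, n_in, n_out, seed=0):
        rng = random.Random(seed)  # fixed seed so runs are repeatable
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        self.b = [0.0] * n_out

    def forward(self, x):
        """Compute y = sigmoid(W x + b) and cache x and y for backward()."""
        self.x = x
        self.y = []
        for row, bias in zip(self.w, self.b):
            z = sum(wi * xi for wi, xi in zip(row, x)) + bias
            self.y.append(1.0 / (1.0 + math.exp(-z)))
        return self.y

    def backward(self, grad_y):
        """Return (grad_w, grad_b, grad_x) given dLoss/dy for the last forward call."""
        # Sigmoid derivative is y * (1 - y).
        delta = [g * y * (1.0 - y) for g, y in zip(grad_y, self.y)]
        grad_w = [[d * xi for xi in self.x] for d in delta]
        grad_x = [sum(d * row[i] for d, row in zip(delta, self.w))
                  for i in range(len(self.x))]
        return grad_w, delta, grad_x


def squared_error(y, target):
    return 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, target))


def max_gradient_error(layer, x, target, eps=1e-6):
    """Largest gap between the analytic weight gradient and a finite difference."""
    y = layer.forward(x)
    grad_w, _, _ = layer.backward([yi - ti for yi, ti in zip(y, target)])
    worst = 0.0
    for j in range(len(layer.w)):
        for i in range(len(layer.w[j])):
            original = layer.w[j][i]
            layer.w[j][i] = original + eps
            loss_plus = squared_error(layer.forward(x), target)
            layer.w[j][i] = original - eps
            loss_minus = squared_error(layer.forward(x), target)
            layer.w[j][i] = original  # restore the weight
            numeric = (loss_plus - loss_minus) / (2.0 * eps)
            worst = max(worst, abs(numeric - grad_w[j][i]))
    return worst
```

A call such as `max_gradient_error(DenseLayer(3, 2), [0.2, -0.4, 0.7], [1.0, 0.0])` should return a value very close to zero if the backward pass is correct. A large gap usually points to a sign or indexing error in `backward`.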