UPDATE: There are some errors in the current code, please do not train or execute until we fix them. We apologize for any inconvenience.
Check the paper CoSS: leveraging statement semantics for code summarization. Contact: shi_research@163.com
- python 3.7
- torch == 1.4.0
- transformers == 3.5.0
The Java dataset is collected from CodeSearchNet.
To download and pre-process java dataset:
mkdir dataset
cd dataset
wget -q https://s3.amazonaws.com/code-search-net/CodeSearchNet/v2/java.zip
unzip -qq java.zip
cd ..
python data_process_java.py
Python and Solidity dataset can be found here: dataset. Put the dataset folder under the root directory and preprocess:
python data_process_python.py
python data_process_solidity.py
We use Bart vocab as the word list.
The CFG generation tools are provided by PROGEX.
An example of model training settings:
lang=java #programming languages, could be java, solidity, or python
data_dir=./processed_data
python train.py \
--do_train \
--do_eval \
--do_lower_case \
--train_filename $data_dir/$lang/train.jsonl \
--dev_filename $data_dir/$lang/valid.jsonl \
--output_dir model/$lang \
--max_source_length 256 \
--max_target_length 48 \
--beam_size 10 \
--train_batch_size 5 \
--eval_batch_size 5 \
--learning_rate 5e-5 \
--num_train_epochs 10
When we get the trained model, we can generate example outputs:
python output_$lang.py
Example outputs:
Code: public void removeimageview(cubeimageview imageview) { if (null == imageview || null == mfirstimageviewholder) { return; } imageviewholder holder = mfirstimageviewholder; do { if (holder.contains(imageview)) { // make sure entry is right. if (holder == mfirstimageviewholder) { mfirstimageviewholder = holder.mnext; } if (null != holder.mnext) { holder.mnext.mprev = holder.mprev; } if (null != holder.mprev) { holder.mprev.mnext = holder.mnext; } } } while ((holder = holder.mnext) != null); }
Original Comment: remove the imageview from imagetask
Generated Comment: remove the current view.
========================================
Code: protected string getquery() { final stringbuilder ret = new stringbuilder(); try { final string clazzname; if (efapssystemconfiguration.get().containsattributevalue("org.efaps.kernel.index.querybuilder")) { clazzname = efapssystemconfiguration.get().getattributevalue("org.efaps.kernel.index.querybuilder"); } else { clazzname = "org.efaps.esjp.admin.index.lucencequerybuilder"; } final class<?> clazz = class.forname(clazzname, false, efapsclassloader.getinstance()); final object obj = clazz.newinstance(); final method method = clazz.getmethod("getquery4dimvalues", string.class, list.class, list.class); final object newquery = method.invoke(obj, getcurrentquery(), getincluded(), getexcluded()); ret.append(newquery); } catch (final efapsexception | classnotfoundexception | instantiationexception | illegalaccessexception | nosuchmethodexception | securityexception | illegalargumentexception | invocationtargetexception e) { indexsearch.log.error("catched", e); ret.append(getcurrentquery()); } return ret.tostring(); }
Original Comment: gets the query.
Generated Comment: get the query instance.
========================================
Code: private void handlehttpclienterrorsforbackend(final httprequest clientrequest, final exception e) { /* notify error handler that we got an error. */ errorhandler.accept(e); /* increment our error count. */ errorcount.incrementandget(); /* create the error message. */ final string errormessage = string.format("unable to make request %s ", clientrequest.address()); /* log the error. */ logger.error(errormessage, e); /* don't send the error to the client if we already handled this, i.e., timedout already. */ if (!clientrequest.ishandled()) { clientrequest.handled(); /* notify the client that there was an error. */ clientrequest.getreceiver().error(string.format("\"%s\"", errormessage)); } }
Original Comment: handle errors.
Generated Comment: handles an error from the server.