This article proposes and applies a machine learning method for analyzing the direction of returns of Exchange Traded Funds (ETFs) from the historical return data of their components, with the aim of supporting investment strategy decisions through a trading algorithm. Methodologically, regression and classification models were applied to standard datasets from the Brazilian and American markets and evaluated with algorithmic error metrics. The results were compared with those of the naive forecast and with the returns obtained by the buy & hold technique over the same period. In terms of risk and return, the models mostly outperformed these controls, notably the linear regression model and the classification models based on logistic regression, support vector machine (using the LinearSVC model), Gaussian Naive Bayes and K-Nearest Neighbors; in certain datasets, returns were twice and the Sharpe ratio up to four times those of the buy & hold control model.
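The comparison against the naive forecast and buy & hold described above can be illustrated with a minimal sketch. It is not the article's implementation: the trading rule (hold the ETF when the predicted direction is up, otherwise stay in cash), the zero risk-free rate, the 252-period annualization, and the return series are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import statistics


def naive_direction_forecast(returns):
    """Predict each period's direction as the previous period's observed direction."""
    # Assumption: with no history for the first period, stay out of the market.
    signals = [0]
    for r in returns[:-1]:
        signals.append(1 if r > 0 else 0)
    return signals


def strategy_returns(signals, returns):
    """Assumed rule: earn the ETF return when the signal is 1, zero otherwise."""
    return [s * r for s, r in zip(signals, returns)]


def cumulative_return(returns):
    """Compound simple per-period returns into a total return."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio; risk_free is a per-period rate (assumed 0 here)."""
    excess = [r - risk_free for r in returns]
    sd = statistics.stdev(excess)
    if sd == 0:
        return float("nan")  # undefined when returns have no variation
    return statistics.mean(excess) / sd * math.sqrt(periods_per_year)


# Made-up daily ETF returns, for illustration only (not data from the article).
etf = [0.01, 0.02, -0.01, -0.02, 0.015, 0.005, -0.03, 0.01]

signals = naive_direction_forecast(etf)
strat = strategy_returns(signals, etf)

for name, series in [("buy & hold", etf), ("naive forecast", strat)]:
    print(name, cumulative_return(series), sharpe_ratio(series))
```

A fitted classifier's predicted directions could be passed to `strategy_returns` in place of the naive signals, so that each model is scored against the same two controls.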