Receipt date: 29.04.2021
Year: 2021
Journal number: 
UDC: 519.862.6
DOI: 10.26731/2658-3704.2021.2(10).1-12

Pages: 1–12
Abstract: 

This article addresses feature selection in regression models estimated by the least squares method. In earlier work, this problem was formulated as a partial Boolean linear programming problem, that is, a linear program in which some of the variables are Boolean. The LPSolve package was used to test this formulation. The feature selection problem with a constraint on the degree of multicollinearity was tested as well. The test results coincided exactly with the solutions obtained by full enumeration of regressions, which confirms the correctness of the developed mathematical apparatus.

List of references: 

1. Noskov S.I. Tehnologija modelirovanija ob’ektov s nestabil'nym funkcionirovaniem i neopredelennost'ju v dannyh [Modeling technology for objects with unstable operation and data uncertainty]. Irkutsk, RIC GP «Oblinformpechat'» Publ., 1996. 321 p.

2. Noskov S.I., Baenhaeva A.V. Mnozhestvennoe ocenivanie parametrov linejnogo regressionnogo uravnenija [Multiple Estimation of Linear Regression Equation Parameters]. Sovremennye tehnologii. Sistemnyj analiz. Modelirovanie [Modern technologies. System analysis. Modeling]. 2016, no. 3, vol. 51, pp. 133–138.

3. Baenhaeva A.V., Bazilevskiy M.P., Noskov S.I. Modelirovanie valovogo regional'nogo produkta Irkutskoy oblasti na osnove primeneniya metodiki mnozhestvennogo otsenivaniya regressionnykh parametrov [Modeling the gross regional product of the Irkutsk region on the basis of the technique of multiple estimation of regression parameters]. Fundamental'nye issledovaniya [Fundamental research]. 2016, no. 10-1, pp. 9–14.

4. Noskov S.I., Perfil'eva K.S. Jempiricheskij analiz nekotoryh svojstv metoda smeshannogo ocenivanija parametrov linejnogo regressionnogo uravnenija [An empirical analysis of some properties of the method of mixed estimation of parameters of a linear regression equation]. Nauka i biznes: puti razvitija [Science and business: ways of development]. 2020, no. 6, vol. 108, pp. 62–66.

5. Noskov S.I. O metode smeshannogo ocenivanija parametrov linejnoj regressii [On the method of mixed estimation of linear regression parameters]. Informacionnye tehnologii i matematicheskoe modelirovanie v upravlenii slozhnymi sistemami [Information technology and mathematical modeling in the management of complex systems]. 2019, no. 1, vol. 2, pp. 41-45.

6. Noskov S.I., Vrublevskij I.P., Zajanchukovskaja V.O. Primenenie interval'nogo regressionnogo analiza dlja modelirovanija ob'ektov transporta [Using interval regression analysis for modeling transport objects]. Vestnik Ural'skogo gosudarstvennogo universiteta putej soobshhenija [Bulletin of the Ural State Transport University]. 2020, no. 3, vol. 47, pp. 45-52.

7. Noskov S.I., Bazilevskij M.P. Postroenie regressionnyh modelej s ispol'zovaniem apparata linejno-bulevogo programmirovanija [Building regression models using the linear-boolean programming apparatus]. Irkutsk, IrGUPS, 2018. 176 p.

8. Noskov S.I. Kriterij "soglasovannost' povedenija" v regressionnom analize [Criterion "consistency of behavior" in regression analysis]. Sovremennye tehnologii. Sistemnyj analiz. Modelirovanie [Modern technologies. System analysis. Modeling]. 2013, no. 1, vol. 37, pp. 107-110.

9. Noskov S.I. Obobshhennyj kriterij soglasovannosti povedenija v regressionnom analize [Generalized criterion for the consistency of behavior in regression analysis]. Informacionnye tehnologii i matematicheskoe modelirovanie v upravlenii slozhnymi sistemami [Information technology and mathematical modeling in the management of complex systems]. 2018, no. 1, vol. 1, pp. 14-20.

10. Noskov S.I., Bazilevskij M.P. Mnozhestvennoe ocenivanie parametrov i kriterij soglasovannosti povedenija v regressionnom analize [Multiple Parameter Estimation and Behavior Consistency Criterion in Regression Analysis]. Vestnik Irkutskogo gosudarstvennogo tehnicheskogo universiteta [Irkutsk State Technical University Bulletin]. 2018, no. 4, vol. 135, pp. 101-110.

11. Bazilevskij M.P., Noskov S.I. Ocenivanie indeksnyh modelej regressii s pomoshh'ju metoda naimen'shih modulej [Estimating index regression models using the least absolute deviations method]. Vestnik Rossijskogo novogo universiteta. Serija: Slozhnye sistemy: modeli, analiz i upravlenie [Bulletin of the Russian New University. Series: Complex Systems: Models, Analysis and Management]. 2020, no. 1, pp. 17-23.

12. Bazilevskij M.P., Noskov S.I. Formalizacija zadachi postroenija linejno-mul'tiplikativnoj regressii v vide zadachi chastichno-bulevogo linejnogo programmirovanija [Formalization of the problem of constructing linear multiplicative regression as a partial boolean linear programming problem]. Sovremennye tehnologii. Sistemnyj analiz. Modelirovanie [Modern technologies. System analysis. Modeling]. 2017, no. 3, vol. 55, pp. 101-105.

13. Bazilevskij M.P. Programmnyj kompleks postroenija linejno-mul'tiplikativnyh regressij [A software package for constructing linear multiplicative regressions]. Prikladnaja informatika [Applied Informatics]. 2018, no. 3, vol. 75, pp. 110-123.

14. Bazilevskij M.P., Vergasov A.S., Noskov S.I. Gruppovoj otbor informativnyh peremennyh v regressionnyh modeljah [Group selection of informative variables in regression models]. Juzhno-Sibirskij nauchnyj vestnik [South Siberian Scientific Bulletin]. 2019, no. 4-1, vol. 28, pp. 36-39.

15. Bazilevskij M.P., Noskov S.I. Programmnyj kompleks postroenija linejnoj regressionnoj modeli s uchetom kriterija soglasovannosti povedenija fakticheskoj i raschetnoj traektorij izmenenija znachenij ob'jasnjaemoj peremennoj [A software package for constructing a linear regression model taking into account the criterion of consistency of the behavior of the actual and calculated trajectories of change in the values of the explained variable]. Vestnik Irkutskogo gosudarstvennogo tehnicheskogo universiteta [Irkutsk State Technical University Bulletin]. 2017, no. 9, vol. 128, pp. 37-44.

16. Konno H., Yamamoto R. Choosing the best set of variables in regression analysis using integer programming. Journal of Global Optimization. 2009, vol. 44, pp. 272–282.

17. Park Y.W., Klabjan D. Subset selection for multiple linear regression via optimization. Journal of Global Optimization. 2020, vol. 77, pp. 543–574.

18. Miyashiro R., Takano Y. Mixed integer second-order cone programming formulations for variable selection in linear regression. European Journal of Operational Research. 2015, vol. 247, pp. 721–731.

19. Miyashiro R., Takano Y. Subset selection by Mallows’ Cp: a mixed integer programming approach. Expert Systems with Applications. 2015, vol. 42, pp. 325–331.

20. Bertsimas D., King A., Mazumder R. Best subset selection via a modern optimization lens. The Annals of Statistics. 2016, vol. 44, pp. 813–852.

21. Bertsimas D., King A. OR forum – An algorithmic approach to linear regression. Operations Research. 2016, vol. 64, pp. 2–16.

22. Konno H., Takaya Y. Multi-step methods for choosing the best set of variables in regression analysis. Computational Optimization and Applications. 2010, vol. 46, pp. 417–426.

23. Bazilevskij M.P. Svedenie zadachi otbora informativnyh regressorov pri ocenivanii linejnoj regressionnoj modeli po metodu naimen'shih kvadratov k zadache chastichno-bulevogo linejnogo programmirovanija [Reduction of the problem of selection of informative regressors when estimating a linear regression model using the least squares method to a partial boolean linear programming problem]. Modelirovanie, optimizacija i informacionnye tehnologii [Modeling, optimization and information technology]. 2018, no. 1, vol. 20, pp. 108-117.

24. Bazilevskij M.P. Otbor informativnyh regressorov s uchetom mul'tikollinearnosti mezhdu nimi v regressionnyh modeljah kak zadacha chastichno-bulevogo linejnogo programmirovanija [Selection of informative regressors taking into account the multicollinearity between them in regression models as a partial Boolean linear programming problem]. Modelirovanie, optimizacija i informacionnye tehnologii [Modeling, optimization and information technology]. 2018, no. 2, vol. 21, pp. 104-118.

25. Bazilevskij M.P. Otbor optimal'nogo chisla informativnyh regressorov po skorrektirovannomu kojefficientu determinacii v regressionnyh modeljah kak zadacha chastichno celochislennogo linejnogo programmirovanija [Selection of the optimal number of informative regressors by the corrected coefficient of determination in regression models as a partial integer linear programming problem]. Prikladnaja matematika i voprosy upravlenija [Applied Mathematics and Management Issues]. 2020, no. 2, pp. 41-54.