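Supervised by: Chinese Academy of Sciences
Sponsored by: Chinese Society of Optimization, Overall Planning and Economic Mathematics
   Institutes of Science and Development, Chinese Academy of Sciences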

Chinese Journal of Management Science ›› 2017, Vol. 25 ›› Issue (10): 100-108. doi: 10.16381/j.cnki.issn1003-207x.2017.10.011

• Articles •

Sparse Storage for Super-large-scale Linear Programming and Methods for Identifying and Disposing of Duplicate Rows in its Presolving

WU Yu1,2, HUANG Si-ming1   

  1. Institutes of Science and Development, Chinese Academy of Sciences, Beijing 100190, China;
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2016-06-30 Revised: 2017-02-14 Online: 2017-10-20 Published: 2017-12-15

Abstract: With the arrival of the big data era, linear programming problems are inevitably growing in size. For super-large-scale linear programming problems, how to store the data is an urgent and important question: a good scheme should save storage space, avoid wasting resources, and make it convenient to inspect, modify, and delete data. This paper proposes a sparse storage data structure based on an improved orthogonal list. Its effectiveness in saving storage space is verified on several super-large-scale linear programming cases from the Netlib database. Furthermore, because the data often contain much redundancy, a presolving process is usually required before an algorithm is applied to solve the linear programming problem, and identifying and disposing of duplicate rows is one of its key steps. This paper proposes methods for both tasks. First, duplicate rows and related concepts are defined. The definition of duplicate rows used here differs from the usual one in that columns with only one nonzero element are not taken into account. Second, building on the proposed storage structure, a simple method for identifying duplicate rows is proposed; it is based on the idea of classification, is easy to carry out, and requires only a single pass from the first column to the last. Third, by summarizing the existing literature, two basic principles for eliminating redundant rows are obtained: first, increase the number of one-element columns as much as possible; second, reduce the number of nonzero elements as much as possible. Based on these two principles, the nonzero elements of duplicate rows are classified into different sets, and the number of nonzero elements within each set is analyzed theoretically. This yields a method for disposing of duplicate rows that not only preserves the sparsity of the data but also increases the number of one-element columns. Finally, the proposed methods are applied to a small linear programming example to illustrate the concrete process of identifying and disposing of duplicate rows, and then to several linear programming cases selected from the Netlib database to verify their effectiveness. The results show that, for small-scale linear programming problems or problems with few duplicate rows, the advantage of the proposed data structure and methods may be negligible or not obvious. For large-scale linear programming problems with dense duplicate rows, however, the larger the scale or the denser the duplicate rows, the more pronounced the benefit.
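
The abstract does not spell out the improved orthogonal list or the exact classification procedure, so the following is only a minimal sketch under stated assumptions: it uses a plain orthogonal (cross-linked) list, and a generic class-refinement sweep over the columns that skips one-element columns and treats two rows as duplicates when one is a nonzero multiple of the other on the remaining columns. All names (`Node`, `OrthogonalList`, `duplicate_row_groups`) and the small test matrix are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from fractions import Fraction


class Node:
    """One nonzero entry, linked along its row and along its column."""
    __slots__ = ("row", "col", "val", "right", "down")

    def __init__(self, row, col, val):
        self.row, self.col, self.val = row, col, val
        self.right = None  # next nonzero in the same row
        self.down = None   # next nonzero in the same column


class OrthogonalList:
    """Plain cross-linked sparse matrix (assumed layout, not the paper's)."""

    def __init__(self, n_rows, n_cols):
        self.row_head = [None] * n_rows
        self.col_head = [None] * n_cols
        self._row_tail = [None] * n_rows
        self._col_tail = [None] * n_cols
        self.col_count = [0] * n_cols  # number of nonzeros per column

    def append(self, row, col, val):
        """Add a nonzero; entries are assumed to arrive in row-major order."""
        node = Node(row, col, val)
        if self._row_tail[row] is None:
            self.row_head[row] = node
        else:
            self._row_tail[row].right = node
        self._row_tail[row] = node
        if self._col_tail[col] is None:
            self.col_head[col] = node
        else:
            self._col_tail[col].down = node
        self._col_tail[col] = node
        self.col_count[col] += 1

    def column(self, col):
        """Yield the nonzero nodes of one column from top to bottom."""
        node = self.col_head[col]
        while node is not None:
            yield node
            node = node.down


def duplicate_row_groups(mat):
    """Refine row classes in one sweep from the first column to the last."""
    cls = [0] * len(mat.row_head)  # current class id of each row
    scale = {}                     # first nonzero of each row in a kept column
    next_id = 1
    for col in range(len(mat.col_head)):
        if mat.col_count[col] <= 1:
            continue  # empty and one-element columns are ignored
        new_ids = {}
        for node in mat.column(col):
            s = scale.setdefault(node.row, Fraction(node.val))
            # Rows stay together only if they shared a class so far and
            # have the same entry after scaling by their first nonzero.
            key = (cls[node.row], Fraction(node.val) / s)
            if key not in new_ids:
                new_ids[key] = next_id
                next_id += 1
            cls[node.row] = new_ids[key]
    groups = defaultdict(list)
    for row in scale:  # rows with no entry in any kept column are skipped
        groups[cls[row]].append(row)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


rows = [[1, 2, 0, 3, 0],
        [2, 4, 5, 6, 0],   # twice row 0, apart from the one-element column 2
        [1, 0, 0, 3, 0],
        [0, 1, 0, 0, 7]]
mat = OrthogonalList(4, 5)
for i, r in enumerate(rows):
    for j, v in enumerate(r):
        if v != 0:
            mat.append(i, j, v)
print(duplicate_row_groups(mat))  # [[0, 1]]
```

Exact `Fraction` arithmetic keeps the ratio comparison free of rounding issues in this toy case; with floating-point LP data a tolerance-based comparison would be needed instead.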

Key words: linear programming, presolving, orthogonal list, sparse storage, duplicate rows
