Re: [eclipse-clp-users] LP solvers via eplex

From: Matthew Skala <mskala_at_...203...> Date: Tue, 20 Jul 2010 12:40:11 -0400 (EDT) · This archive was generated by hypermail 2.3.0 : Wed Sep 25 2024 - 15:13:20 CEST

On Tue, 20 Jul 2010, Kish Shen wrote:
> I am interested in hearing about user experiences with large problems with
> eplex -- how big a problem were you trying to solve? I am also interested in
> hearing about the relative amounts of memory required by ECLiPSe for the
> modelling, as compared to what the external solver need for solving.

Hi - I don't have a lot of interest to say.  The issue was that I had a
solver working pretty well using the ic_sets library, but to satisfy some
third parties who had unrealistic expectations of ILOG CPLEX as a magic
bullet for all optimization problems, my group asked me to check whether
CPLEX would be significantly better than ic_sets.  Thus, I didn't have a
strong incentive to spend a lot of time working on trying to get CPLEX to
work: I didn't need it to get my problem solved.  I also don't have a lot
of detailed records of what I tried and what worked and didn't; it was
never a systematic investigation of the limits of CPLEX.

The general situation, though, was that I had a problem involving encoding
a partial order with a few thousand elements into a set-programming
problem with a few thousand sets, a few hundred to a thousand elements per
set, and pairwise constraints between a significant fraction of the
two-set pairs (thus, millions of pairwise constraints).  That's much too
big for any solver; so I was doing a fair bit of processing in ECLiPSe
first, to use the known structure of the problem to break it into a lot of
smaller instances of set programming of varying sizes.  That is what I
mean by the "modelling" step.  A paper resulting from this work
(presented just last week) is here:
   http://www.aclweb.org/anthology/P10-1153

Further transforming the set programming into integer programming meant
turning each set into a vector of zero-one variables, one for each element
that might or might not be in the set.  In fact my pairwise constraints,
which were of the form "the intersection of these two sets can be at most
this many elements" ended up requiring me to make another such vector for
each constraint; so encoding the entire partial order would have meant
billions of variables.  CPLEX died a lot sooner than that.

These numbers are from memory, but I think in my situation I needed to
give ECLiPSe about 1G of memory to keep its global stack from running out
while it did the splitting into smaller instances; CPLEX's documentation
said it needed (when using the most memory-conservative settings) about a
K of memory per variable; and given the basically cubic number of
variables relative to problem size, figuring 3G of RAM for CPLEX divided
by 1K, cube root, gives a problem size of 143 (sets, elements per set,
etc.)  I think it actually was dying for me a little sooner than that,
when the number of sets and elements in them was in the range of 50 or 60,
but the estimate is probably still good to within the quality of
back-of-the-envelope guessing.  (I'm not sure what the number of
CPLEX-wise constraints would be, but probably comparable to the number of
variables; maybe that should be included in a more accurate estimate.)

I'm not sure that the nature of shared library loading actually would
allow CPLEX to use every byte of the address space that ECLiPSe didn't,
anyway.  I wouldn't be surprised if it was something like Linux reserving
1G for itself and ECLiPSe taking 2, leaving only 1 for CPLEX.  But you
probably know a lot more about that than I do.

All in all, I didn't think that CPLEX or ECLiPSe's interface to it were
doing anything they shouldn't; the real issue was that CPLEX was not an
appropriate tool for solving our problem... but I had been asked
specifically to check that in practice at least a little bit, rather
than just assuming it.  The take-away lesson for people who do want to use
CPLEX is simply that if both it and your main ECLiPSe program need a lot
of memory, on a machine where address space is more limiting than
physical RAM, there may be some value in splitting them into separate
processes.
-- 
Matthew Skala, postdoctoral researcher, Universities of Toronto and Waterloo
mskala_at_...203...    mskala@...205...    mskala@...206...