Welcome to Intel® Software Network Quick Login | Join | Help |
Search in Intel® Software Network Forums
in Go

My comments

Last post 11-01-2007, 11:49 AM by dultavares. 6 replies.
Sort Posts: Previous Next
 10-20-2007, 7:20 PM 30242417  

My comments

I know some of these points have been mentioned/addressed in other threads

I was disappointed to hear that this would be judged on the total run time (including I/O).  I/O consumes 90% of the run time (in my tests) and I've seen it vary from run to run by as much as 100%.  Once I found this out, I immediately stopped optimizing my QuickSort and started working on the I/O.  I was able to cut the I/O time by 50% (over scanf/printf) but would have preferred to spend that time on the QuickSort where I would have worked on improving memory allocation.  It would have been more interesting ... and I think more in the spirit of the contest, but would have had a much smaller payoff.

I hope that the tests will be run multiple times for each submission and that outliers will be discarded.  You may wish to consider using a dedicated drive or some form of SSD.  I know SSD's are not generally as good at serial access, but they are not prone to fragmentation problems or the mechanical delays assoicated with disks so they should produce a more reliable/repeatable test.

I also hope that some of the tests will include ascending and descending sequences, this is often a worst-case for many quicksorts (include the one given as an example).

TBB
It's been years since I've written any C++ (and I haven't missed it).  It's fair to say that I'm syntactically challenged.  More  tbb examples would be great, especially simple ones that isolate a single feature.  I wondered if there was a way to tell if the scheduler was already running.  I wanted to put code in that would start the scheduler if it wasn't running and terminate it only if I had started it, but I didn't see a way to do

All in all I enjoyed myself and I don't envy y'all the task of running all the tests and reviewing all that code. 

Thank you for conducting this contest and I look forward to the results and the future problem sets.
 
 10-20-2007, 8:50 PM 30242419 in reply to 30242417  

Re: My comments

Since this competition encourages parallel design on the sorting part, I think the judges will be thoughtful enough to read our description and code, and add the tick_count objects around the core processing to measure performance. The judges will have to compile these files anyway, adding a few lines to isolate the kernel wouldn't be a big problem, though quite troublesome.
If IO optimization really needs to be done, I think the focus shouldn't be at the ascii manipulating part, but on the parallelism. For example, we can read the file into several char arrays swiftly, and then use multiple threads to do the atoi job. After that, the recognized values are gathered in a big array, as the real input. This can be done efficiently with pipeline and filters. This should have almost linear gain.
 
 10-25-2007, 8:16 AM 30242639 in reply to 30242419  

Re: My comments

As for I/O it would be a nonsense: time depends on hardware and disk format (NTFS or FAT32) and file allocation -- after disk defragmentation the time may change, etc.

As for tick_count: Intel VTune Performance Analyzer is better. Also, rdtsc CPU instruction may be used...

 
 10-29-2007, 12:45 PM 30242817 in reply to 30242419  

Re: My comments

I share your confidence that the judges will be thoughtful and fair.  I hope that they don't have to wrap core methods with timing code to evaluate performance.  I think their job will be difficult enough just getting everything built and the code inspected.  I was trying to point out that timing the total run-time may have unexpected timing variations due to the nature of disk I/O.  My suggestions were intended to point out ways these variations might be reduced without changing the basic testing structure.

As to I/O optimization, I agree that the focus should be on parallelism, however I didn't have time to do so.  The conversion of ascii to int and back was "low hanging fruit" so I focused on that because of the return on time invested.  Reading/Writing the data is ultimately bound by the disk speed.  Even a very fast hard disk may have trouble delivering/accepting enough data to keep a high end processor busy.  On any given platform I imagine it would be difficult to assess the optimal block size needed to keep the CPU and disk as busy as possible.  Added to this is the problem that since the data is text using newline as  a delimiter you can't just read a block of data.  At best you could read a block, then "walk" (on char at a time) to the next linefeed (to ensure you get the entire number).  You also need to make sure that the I/O calls used are thread safe (or maybe tbb safe).  On the plus side you don't need to maintain the order on the input array (cause you are going to sort it anyway) so you could put the numbers in, in any order.  All in all it seems a bit complicated to try with a deadline looming.  I also think that you would find that the 'linear gain' would soon be capped when the limits of the disk I/O were reached.

 

 
 11-01-2007, 12:41 AM 30242967 in reply to 30242817  

Re: My comments

I tried parallelizing the input, and found that the added complexity was not worth the gains --- actually getting the data off the disk is the bottleneck. Of course, there may be a better way of doing parallel reads than I tried.
 
 11-01-2007, 12:55 AM 30242968 in reply to 30242967  

Re: My comments

anthony_williams:
I tried parallelizing the input, and found that the added complexity was not worth the gains --- actually getting the data off the disk is the bottleneck. Of course, there may be a better way of doing parallel reads than I tried.

2 hard disks, 2 input files, 1st thread reads file #1, 2nd reads file #2 ;-)))

 
 11-01-2007, 11:49 AM 30242992 in reply to 30242967  

Re: My comments

I tried implementing a pipeline to read the lines, convert them to ints and store them in the buffer. But I ran out of time. I might still try to finish it just to see how much improvement I'd have gained.

Maybe knowing that this gives you significant improvement may be helpful down the road.
 
View as RSS news feed in XML

Shortcuts


Tags For This Post

...

Community Tags

...