Welcome to Intel® Software Network Quick Login | Join | Help |
Search in Intel® Software Network Forums
in Go

Gosper Gun

Last post 01-26-2008, 11:49 AM by bugman. 10 replies.
Sort Posts: Previous Next
 01-03-2008, 5:28 AM 30246341  

Gosper Gun

Attachment: gosper.txt
Hi all,

if you want to test your code with a non empty grid, I have made a gosper.txt file (attached).

It is a Gosper Gun (http://en.wikipedia.org/wiki/Gun_(CA))

It will fill your grid with gliders

Best regards.
 
 01-21-2008, 12:04 AM 30247409 in reply to 30246341  

Re: Gosper Gun

Thanks for uploading this. It saved me having to do itsmiley [:-)]
 
 01-21-2008, 1:40 PM 30247462 in reply to 30246341  

Re: Gosper Gun

Thanks a lot!

It would be intersting to compare the performance of different implementations.

My program produced the output for this file in 33.95 seconds on Core2 Quad@ 2.88 GHz.

 

BR,

Peter

 
 01-21-2008, 11:42 PM 30247488 in reply to 30247462  

Re: Gosper Gun

Hello Peter,

on my (almost serial) version of GameOfLife, I got 22.32 seconds

MacPro, DualCore Intel Xeon, 2.66Ghz, 4MB L2 cache
Intel C++ compiler 10.1
profile guided optimization

I only optimized the cache friendliness of the structures used, the memory allocation and I got a 7 times speedup.

I have tried many parallel version splitting the list spatially and dealing with overlapping. But the cost of these structures was not counterbalanced by the "parallel" gain.

Good luck for the following problems.
 
 01-22-2008, 6:46 AM 30247515 in reply to 30246341  

Re: Gosper Gun

My program produced the output for this file in 25.9 seconds on Core@ 1.6 GHz.

dell640m t2050  1G

             GENERATION #100000

-----------------------**---------------
-----------------------**---------------
----------*----*----------**------**----
--------*-*----*----------***-----**----
**----**-------*----------**------------
**----**-----------**--**---------------
------**--------**--*--**---------------
--------*-*-----****--------------------
----------*-------*---------------------
----------------------------------------

 
 01-22-2008, 12:14 PM 30247541 in reply to 30247488  

Re: Gosper Gun

Hmm,

my solution seems to be a good example of negative scalability :-(

1 thread: 33.6 seconds

2 threads: 20.6 seconds

3 threads: 27.8 seconds

4 threads: 32.6 seconds

I am wondering what would happen on 8-core judges' machine :-(

 
 01-22-2008, 12:28 PM 30247543 in reply to 30247515  

Re: Gosper Gun

Oh, now I'm a bit depressed. I finally got around to timing things (using "time" of all the crazy ideas!) and my serial version took 33s, and the parallel one 39s (with two cores). I hope with more cores, and a different problem set it is actually worth it.

And denghui, I kept the output form of the sample program, and printed out a vertical strip, rather than a horizontal one.

25.9s seems very quick to me! Congratulations.
 
 01-22-2008, 1:00 PM 30247549 in reply to 30247541  

Re: Gosper Gun

You're not alone! I'm wondering if I have a bug, as I just tried mine out, and get

Threads : Seconds

1      34.9
2      38.8
3      39.2
4      38.2
8      37.8
16    37.2

The later values actually make sense, as I get better load balancing with more threads, since I chop up the world more evenly. What seems strange is that the two thread version is so much worse than the single thread (especially given that I don't use any locks or atomics, which could have contention.)

I didn't use affinity -- I wonder if that would play an important role. I also re-create the tasks, rather than recycling them, but that *should* be minor.

(I have a core 2 6600 that is only running at 1600 MHz for some reason, not 2.40G)


 
 01-25-2008, 12:03 PM 30247788 in reply to 30247549  

Re: Gosper Gun

It seems to me that the problem exhibited by striped implementations is caused by very small number of active cells. This increases the overhead of synchronization and task management.The problem becomes almost serial in this case. But if the grid is uniformly filles with 1E7 active cells, the scalability of my implementation seems to be much better:

1 thread - 143.7 s

2 threads - 95.79 s

3-threads  86.1017 s

4 threads - 85.8872 s

 

 
 01-25-2008, 3:48 PM 30247802 in reply to 30247788  

Re: Gosper Gun

Well, you left open the key question: How many cores does your machine have? (Yours at least shows a very good speed-up with a second core.)
 
 01-26-2008, 11:49 AM 30247831 in reply to 30247802  

Re: Gosper Gun

It was Core2Quad @2.88 GHz

 
View as RSS news feed in XML

Shortcuts


Tags For This Post

...

Community Tags

...