Friday, August 31, 2007

A New SSE Instruction Set: AMD Announces SSE5

In a world dominated by multi-threaded ambitions and, often, single-threaded limitations, hardware advancements can make the biggest difference in performance. AMD has released a new extension for x86 that it hopes will address at least part of that. Dubbed SSE5, this newest generation adds power to x86 by introducing not only a whole new instruction class but also powerful multiply-accumulate instructions. Both of these advancements should deliver notable savings in compute time.




While extending the x86 instruction set with new iterations of SSE has become a regular activity in the computing industry, many of these additions are in actuality a gradual reshaping of x86 processors. As a general purpose CPU design, x86 doesn't have any hard limitations (given enough time you can do any kind of calculation required), but it has had several weak points patched up over the years. The basis for identifying and patching these weak points has been looking at what other processors - general and specialized - are doing well while x86 is doing poorly at the time. Each iteration of SSE so far has then erased some of these weak points by implementing features that those other processors already have.

All told, SSE5 includes 46 unique "base" instructions, with many of those instructions featuring several variations that work on different data types. With all of these variations, the total number of instructions introduced altogether with SSE5 is 170. For comparison's sake, the entire original x86 instruction set was a mere 80 instructions.

With SSE5, AMD is focusing on 5 groups of instructions. Those groups are:

* Fused multiply accumulate (FMACxx) instructions
* Integer multiply accumulate (IMAC, IMADC) instructions
* Permutation and conditional move instructions
* Vector compare and test instructions
* Precision control, rounding, and conversion instructions
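
The multiply-accumulate pattern behind the first two groups can be sketched in portable C. This is a scalar illustration only, not SSE5 code: the function name `mac_i16` and the choice of 16-bit inputs with a 32-bit accumulator are assumptions made for the example, and the real instructions operate on packed vector registers.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar multiply-accumulate: acc += a[i] * b[i] for every element.
   Inputs are widened to 32 bits before multiplying so a single product
   cannot overflow; the caller must keep the running sum within int32_t. */
int32_t mac_i16(const int16_t *a, const int16_t *b, size_t n) {
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)a[i] * (int32_t)b[i];  /* multiply, then accumulate */
    }
    return acc;
}
```

With `a = {1, 2, 3, 4}` and `b = {5, 6, 7, 8}` the function returns 70. A hardware multiply-accumulate instruction folds the multiply and the add in the loop body into a single operation, which is where the instruction savings come from.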

As we hinted at earlier, many of these instructions are implementations of features found elsewhere. DSPs in particular have been and continue to be a major source of new instructions for new versions of SSE, with many of these instructions allowing a CPU to process data for specialized cases at DSP-like speeds.

AMD has provided us with an example of such a situation: the code for a 4x4 matrix multiplication operation, one version coded optimally in SSE3, the other optimized for SSE5. The SSE3 code requires 34 instructions, while the SSE5 code does it in 20, or about 41% fewer. There is more to the performance of such a code segment than the number of instructions, since the time to execute and retire an instruction varies from one instruction to the next, so this example isn't necessarily 41% faster. That doesn't negate the performance improvement offered by such code, however.
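
AMD's listings aren't reproduced here, but a plain C sketch shows why a 4x4 matrix multiplication leans so heavily on multiply-accumulate: each of the 16 outputs is a chain of four of them. The names `matmul4` and `percent_fewer` and the row-major 16-float layout are assumptions for this illustration; this is not the SSE3 or SSE5 code AMD supplied.

```c
#include <stddef.h>

/* out = a * b for 4x4 matrices stored row-major in arrays of 16 floats.
   out must not alias a or b. */
void matmul4(const float *a, const float *b, float *out) {
    for (size_t i = 0; i < 4; i++) {
        for (size_t j = 0; j < 4; j++) {
            float acc = 0.0f;
            for (size_t k = 0; k < 4; k++) {
                acc += a[i * 4 + k] * b[k * 4 + j];  /* one multiply-accumulate */
            }
            out[i * 4 + j] = acc;
        }
    }
}

/* Percentage reduction in instruction count, e.g. from 34 down to 20. */
double percent_fewer(int before, int after) {
    return 100.0 * (double)(before - after) / (double)before;
}
```

`percent_fewer(34, 20)` comes out at roughly 41.2, which is the source of the 41% figure. It measures instruction count only, not execution time.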







For actual performance numbers with SSE5, AMD has told us that a discrete cosine transform - an operation important for image and movie encoding - can be done 30% faster using SSE5 than SSE3. They have even more impressive numbers for encryption processing, with a 5x performance improvement possible on certain encryption tasks, although we suspect this case is more limited than their image encoding scenario. Either way, the promise of other performance improvements is there. How much of it is realized is going to depend heavily on how well programmers are able to extract additional performance out of SSE5, and on how good AMD's own optimized math libraries are once those are released.

SSE5 philosophy

Both AMD and Intel are looking primarily at future software needs when they consider which way to move with hardware advancements. Recognizing that future software will benefit from parallel operations is paramount. With SSE5, AMD is looking at compute-intensive, multimedia and security applications. It is targeting wide industry adoption through many software vendors, and full tool support is expected to be available in 2008, including a fully supported GCC compiler.

Conclusion

SSE5 extends performance boundaries with new, combined and three-operand instructions. AMD's efforts in the x86 arena reveal a very clear focus and intent. AMD is looking at the needs of the software industry and bringing forth hardware which addresses many of those needs. AMD's Light-Weight Profiling (LWP) initiative is another example of providing useful tools via hardware. LWP gives software developers a way to learn things they might not otherwise be able to know, certainly not without custom-developed, complex and costly runtime analysis add-ons.

Sources:
http://anandtech.com
http://www.tgdaily.com


Friday, August 24, 2007

MSI X38 mobo to have many new features



Sources close to MSI have told bit-tech that those waiting for the next generation Intel X38 chipset can expect a whole host of new features on the company's flagship motherboard.

The sources said that MSI will use a brand new "Circu-Pipe Liquid" technology, which is an upgraded version of what we saw on the MSI P35 Platinum. Whether it will be just liquid inside the heatpipes that you'll never see, or some form of user watercooling, remains to be seen.

The board includes support for both DDR2 and DDR3 memory, finally giving end users who are desperate for an alternative to the RD600-based DFI ICFX3200 T2R/G for ATI CrossFire a board that uses an Intel chipset and has a decent upgrade path. Having both DDR2 and DDR3 support on the board means that you won't need to fork out for both a (potentially expensive) motherboard and expensive DDR3 when DDR3 becomes more widely accepted.

There is a single power regulation phase for both DDR2 and DDR3, as well as separate power phases for both PCI-Express x16 slots, which will deliver full x16 bandwidth and support PCI-Express 2.0.

The CPU now also gets a four-way dual-channel PWM design. How this differs from a simple eight-phase design we aren't quite sure; maybe it's similar to the Abit IP35 Pro's 4x3 arrangement?

The X38 Diamond also comes with the X-Fi Xtreme Audio PCI-Express x1 card, like the P35 Diamond, and it has two extra PCI-Express x16 slots from an extra chipset - these both support up to x4 bandwidth. However, this comes at the expense of an extra PCI slot (of which there is only one). To be honest, why include two PCI-Express x4 slots instead of a single PCI-Express x8, which is more suitable for high end RAID setups and GPU-Physics (if it ever arrives)?

On the stacked rear I/O there are now eight USB ports, Firewire, PS/2, S/PDIF, a couple of Gigabit Ethernet and eSATA ports, as well as six 3.5mm audio jacks.

Finally MSI has dropped the quite frankly crap D-LED readout in favour of a new POST LED readout! Huzzah!

We'll leave you with what we believe is the first image of MSI's X38 Diamond motherboard to hit the 'net, but please bear in mind that this obviously isn't the final cooling solution. Instead, it'll have the aforementioned Circu-Pipe Liquid cooler which will cover the PWMs too. The board is also missing the four extra SATA ports down on the bottom right where you can see the solder points ready for them.

- 4 phase dual-channel PWM
- 4xDDR3 + 2xDDR2 Combo with Switch card
- 4 PCI Express x16 slots (run at x16/x16/x4/x4)
- supports CrossFire X16 or Triple card in the future / PCIe 2.0
- X-Fi Xtreme audio card
- Onboard LED
- Circu-Pipe Liquid (under development)

Hmmm... why does it only have 2 SATA ports? :lol: :lol:

Source:
- www.bit-tech.net


Intel Core 2 Extreme X7900 launched - Mobile Powerhouse

Intel presented a new mobile gaming processor at the Games Convention in Leipzig, the Core 2 Extreme X7900. This 65nm chip is clocked at 2.8GHz with an 800MHz FSB and a 44W TDP.

Just like the desktop part, this processor is unlocked, and Intel expects OEMs to implement the overclocking function for gamers. However, that is easier said than done. On the desktop side this was a no-brainer, thanks to a massive number of third-party cooling vendors, but laptops do not sport all that space for advanced cooling. Expect a 3-3.2 GHz clock, but not much more than that. This might change once we see Penryns coming out in force.

Intel's rep demonstrated the new processor inside two new laptops: the Asus G2 and Dell's new baby, codenamed The Beast. The Beast's market name, a rather boring M1730, was printed on the case itself, which has a rather plasticky feel. This notebook sports interesting components in terms of central, graphics, and physics processing power.

Intel® Core™2 Duo Mobile Processor X7900 Extreme 2.8GHz 4M 800MHz Socket P (478pin) CPU for Santa Rosa System.

Processor Specifications:
sSpec Number: QZDX
CPU Speed: 2.80 GHz
Code name: Merom XE
Bus Speed: 800 MHz
Bus/Core Ratio: 6~14(unlocked)
L2 Cache Size: 4 MB
L2 Cache Speed: 2.8 GHz

Package Type: Socket P (478-pin micro-FCPGA)
Manufacturing Technology: 65 nm
Core Stepping: E1 (QS)
CPUID String: 06FDh
Thermal Design Power: 44W
Thermal Specification: 100°C
Core Voltage: 1.0375 - 1.30V
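
The bus speed and bus/core ratio in the list above determine the core clock, and the relationship can be sketched in a few lines of C. This assumes the quad-pumped front-side bus Intel used in this generation, so the "800 MHz" bus runs on a 200 MHz base clock; the function names are made up for the illustration.

```c
/* A quad-pumped FSB transfers data four times per base clock cycle,
   so the rated speed is four times the underlying bus clock. */
unsigned bus_clock_mhz(unsigned fsb_rated_mhz) {
    return fsb_rated_mhz / 4;
}

/* Core clock in MHz = base bus clock x bus/core ratio. */
unsigned core_clock_mhz(unsigned fsb_rated_mhz, unsigned ratio) {
    return bus_clock_mhz(fsb_rated_mhz) * ratio;
}
```

With the stock ratio of 14, `core_clock_mhz(800, 14)` gives 2800, matching the 2.80 GHz rating. Because the ratio is unlocked, ratios of 15 and 16 would give 3000 and 3200 MHz, which lines up with the 3-3.2 GHz overclocking estimate above.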

Dell XPS M1730 laptop:


Notebook specifications from Intel:


Benchmarks from Intel:




Supported Features:
# Dual Core
# Enhanced Intel Speedstep® Technology
# Execute Disable Bit
# MMX, SSE, SSE2, SSE3, SSSE3, EM64T
# Intel® Virtualization Technology

Sources:
1. Dvhardware
2. Intel
