Introduction and Packaging
The Supermicro 7048GR-TR Workstation is an enterprise class workstation built to high standards, and designed to accommodate several types of PCIe expansion devices, such as graphics accelerators, Tesla cards, and Intel Xeon Phi Coprocessors.
The supporting motherboard is the X10DRG-Q, which we reviewed earlier. The 7048GR-TR Workstation supports the latest Intel Haswell-EP processors, E5-2600 v3 series, and fast DDR4 memory.
One of the unique features of this workstation is support for 7x PCIe slots, 4x of which can be used for GPU expansion cards; the 1x slot is typically used for an AOC-TBT-DSL5320 Thunderbolt Add-On Card to provide remote operation.
The 7048GR-TR Workstation comes with dual redundant 2,000-watt power supplies (to ensure uptime should a PSU fail), 8x hot-swappable drive bays, and easy fan replacement if needed. As you can see, the 7048GR-TR is an impressive machine. The system we will be reviewing today has 4x Intel Xeon Phi 3120A Coprocessors installed. Around the lab, we began calling this the "Supermicro 7048GR-TR Xeon Phi Battleship," because of its capabilities. Let's move on to unpacking this machine.
If you have ever opened a Supermicro server box, then you will be right at home unpacking the 7048GR-TR workstation. Supermicro packs its servers in heavy-duty, double-boxed, foam-packed packaging to protect them during shipping. As you can see, there is plenty of space between the outside box and the actual workstation inside, which protects the workstation well from punctures and dents. Heavy-duty foam spacers cushion the workstation, and prevent it from moving around inside the box. The packaging does an excellent job of protecting the workstation from drops and other mishandling.
The foam spacers are split into four parts; there are two on the bottom, and two on the top. Removing the two top pieces provides access to the workstation; after these are removed, the workstation can then be lifted out. The 7048GR-TR weighs in at a net weight of 46 lbs. (20.9kg) for the basic case alone, and can easily weigh much more when fully configured, so use two people to lift the workstation out of the shipping box.
Specifications and Layout
The 7048GR-TR workstation has an impressive specification list, which comes complete with a massive eight-drive storage system, dual redundant 2,000-watt power supplies, and a whole host of other features listed above. The base motherboard used in the 7048GR-TR workstation is the X10DRG-Q, which we reviewed earlier.
The X10DRG-Q comes with five PCIe slots; however, the very first PCIe slot does not have clearance to fit a full size expansion card. This slot is typically used for an AOC-TBT-DSL5320 Thunderbolt Add-On Card, which will allow remote management in an office environment while the workstation is in a separate location.
Optional accessories are as follows:
- TPM security module - TPM module with Infineon 9655, RoHS/REACH, PBF (Vertical or Horizontal, depending on the server layout and expansion cards used)
- SuperDOM - Supermicro SATA DOM Solutions
- AOC-TBT-DSL5320 Thunderbolt Add-On Card
Let's move on and take a look at the workstation.
Here we get our first look at the 7048GR-TR workstation outside of the shipping box.
With a size of 18.2" (462mm) x 7.0" (178mm) x 26.5" (673mm), the 7048GR-TR is large, and impressive looking. The case itself can be fitted with a 4U rack-mountable rail kit for installations in server racks. In our testing, we used the case mounting feet for Tower operations. The feet simply slide into slots on the side of the case, screw into place, and with a simple rotation, the case is in upright position.
The front of the case includes a full-size lockable door, which we have opened here to show the front access to the drive bays. The door will snap into closed position, so it does not fall open by itself while you are moving it around.
We can see here that there are plenty of storage bays available on the 7048GR-TR workstation; these bays include:
- 8x 3.5" hot-swap SAS/SATA drive trays
- 3x 5.25" drive bays in storage module
- 1x 3.5" fixed drive bay
We would like to see a dust filter installed on the front door because of the large air flow that moves through the case when the fans are all spooled up.
Each of the 8x 3.5" hot-swap SAS/SATA drive trays can be unlocked and removed for easy drive access.
At the top of the case, we find the power/reset buttons, status LEDs, and two USB ports that can be accessed with the front door closed.
Looking at the front of the 7048GR-TR workstation, we can see the lock for the front door and status LEDs, which are easy to see when in operation.
The 7048GR-TR workstation is 26.5" (673mm) long, and requires plenty of desk space if you are planning to run this on a desktop. The side of the case is rather plain; nothing fancy here. You can see the blue locking button, which flips open the side handle when pressed, and allows the side panel to be removed. There are also two screws on the back of this panel, which must be removed before this panel can be taken off.
The back of the workstation has all of the IO and power connections, which we can see here. If you notice, there is a blue locking button at the top, which allows the top plate to be removed if you are planning to use rails for mounting in a server rack.
After removing the side panel, we get our first look at the insides of the 7048GR-TR workstation. Yes, this is an impressive machine, and it shows that Supermicro has taken great care to design and build an enterprise quality case to house the workstation components. When looking over the system, you will find it is of high quality, strong build, and no options are left off the table.
Right off the bat, we notice the GPU locking rail, which holds the expansion cards firmly in place. Each tab on the rail can be adjusted to fit different expansion cards.
The middle wall holds 4x 92x38mm four-pin PWM controlled fans to provide the cooling needs for the case, all within easy access. All fans can be removed to provide easy access to SATA and other cables.
The X10DRG-Q is a large motherboard, and fills the entire bottom of the motherboard area.
The outfit for the 7048GR-TR workstation includes 4x Intel Xeon Phi 3120A Coprocessors that fit nicely in the enclosed space. Two high-speed case fans direct airflow through this area to supply cooling for these cards.
You can also spot the empty PCIe slot, which would normally have an AOC-TBT-DSL5320 Thunderbolt Add-On Card, if the user wants that option.
Each of the 4x 92x38mm four-pin PWM middle bar cooling fans is easily removed by simply pressing in on the locking tab, and pulling outward.
Here we get a look at the back plane for the 8x hot swap SATA drive bays.
The GPU locking rail simply attaches at the bottom of the case with two tabs that fit into locking positions, and then swings up to the top of the case. Here we see the top of the rail, which has a locking lever that slides up to lock the bar in place, and a retention screw to finish the installation. To remove the rail, simply unscrew the retention screw, pull down the locking lever, and swing the rail outward.
Here we see the storage module and its 3x 5.25" drive bays, and power, USB, and status LEDs. For rack mount installations, this can be rotated 90 degrees by simply pushing in the blue locking button, and sliding the module out. If you are building the system from scratch, this should be done before installing drives or other items.
The system we tested has 2,000-watt redundant power supplies. To replace a power supply, simply press the locking lever, and pull the PSU out.
The back of the case also includes 2x 80x38mm, four-pin PWM rear exhaust fans. To replace a fan, simply push in the locking button, and pull out.
BIOS, Remote Management and Software
The BIOS for the 7048GR-TR workstation is the same as the X10DRG-Q, and is standard for server motherboards, so we will only show a few BIOS screens, and go over new menu options.
This is the main BIOS screen, which shows basic system information.
The advanced tab brings you to the main advanced screen.
This is the Advanced CPU Configuration menu; there are many options in this area, but most work just fine with default settings.
This is the CPU Advanced Power Management Configuration screen.
Now we are looking at the North Bridge IIO Configuration menu.
Here we are looking at the QPI General Configuration menu.
Here we are looking at the Memory Configuration menu.
Here we are looking at the Memory RAS Configuration menu.
Now we are looking at the PCIe/PCI/PnP Configuration menu.
For our setup using the Intel Xeon Phi Coprocessors, we will need to enable "Above 4G Decoding."
Now we are looking at the SATA Configuration menu.
The last menu is the Boot Options menu.
We find our remote access IP address located in the BIOS, under the IPMI tab. In our case, this was 220.127.116.11. Enter that into your browser, and you will see the login screen.
To log in, use:
As a best practice, administrative users should change factory default Username/Password logins before connecting any new server to their network.
After logging in, we come to the home screen and see system information displayed. There is also a remote control option for iKVM. Please note that when video cards are installed, iKVM will not be available.
The next tab is the Sensor Readings menu.
The Configuration menu allows you to change many features on the server, including Active Directory settings, DNS, LDAP, and many more.
The Power Control and Status menu allows you to power on, shut down, restart, and cycle the server.
The Virtual Media menu allows you to mount or share virtual media like floppy disk and CD-ROM images.
The Maintenance menu allows you to update the firmware and restore factory defaults.
The Miscellaneous menu allows POST snooping, SMC RAKP enable/disable, UID control, and BIOS recovery.
Here we see the Drivers and Tools menu. We are happy to see Supermicro has included a driver ISO file that we can download to install drivers and tools.
The Supermicro SuperDoctor 5 is a hardware-monitoring program that functions in a command-line or web-based interface in Windows and Linux operating systems. The program monitors system health information such as CPU temperature, system voltages, system power consumption, fan speed, and provides alerts via email or Simple Network Management Protocol (SNMP).
SuperDoctor 5 comes in local and remote management versions, and can be used with Nagios to maximize your system monitoring needs. With SuperDoctor 5 Management Server (SSM Server), you can remotely control power on/off, and reset chassis intrusion for multiple systems with SuperDoctor 5 or IPMI. SD5 Management Server monitors HTTP, FTP, and SMTP services to optimize the efficiency of your operation.
IPMIView is a GUI-based software application that allows administrators to manage multiple server systems through BMC. IPMIView monitors and reports the status of servers, and also supports remote KVM and Virtual Media.
Supermicro Power Manager (SPM) is a power management tool that allows you to improve system power utilization. Administrators can configure policies by datacenter, room, row, rack, target machine, or logical group, and those policies can be triggered by condition, power, or temperature thresholds.
Supermicro Update Manager (SUM) remotely updates the BIOS, BMC/IPMI firmware, and system settings on X10-based machines through in-band and out-of-band (OOB) BMC/IPMI communication channels.
Supermicro Server Manager (SSM) provides capabilities to monitor the health of servers, and many other features.
Test System Setup
The X10DRG-Q motherboard is built on the Wellsburg platform (Intel C612) and the new Haswell-EP processors. The processor we will be using is the Intel Xeon E5-2699 v3, which features 18 cores; hyper-threading is enabled for these tests.
The Wellsburg platform (Intel C612) supports processors with four to 18 cores in dual-socket configurations. TDP ranges from 55W up to 160W for workstations. Memory is now DDR4, and can run at a frequency of up to 2133MHz. The E5-2600 v3 processors use 2x QPI 1.1 links at up to 9.6 GT/s. These processors support PCIe 3.0 at up to 8 GT/s, with 40 lanes. The chipset is the Wellsburg PCH, which supports up to 10 SATA ports, along with six USB 3.0 and eight USB 2.0 ports. Wellsburg (C612) also supports DMI2 with 4x lanes.
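To put the 8 GT/s and 9.6 GT/s figures into more familiar units, here is a minimal Python sketch that converts them to GB/s. It assumes the standard PCIe 3.0 128b/130b line encoding and a 2-byte QPI data payload per transfer; neither detail is stated in this review, and the results are theoretical one-direction maximums, not measured throughput.

```python
def pcie3_lane_gbytes_per_s(gt_per_s: float = 8.0) -> float:
    """Usable bandwidth of one PCIe 3.0 lane, one direction, in GB/s."""
    # Assumption: 128b/130b encoding, so 128 payload bits per 130 bits sent.
    return gt_per_s * (128 / 130) / 8


def qpi_link_gbytes_per_s(gt_per_s: float = 9.6, bytes_per_transfer: int = 2) -> float:
    """Raw bandwidth of one QPI link, one direction, in GB/s."""
    # Assumption: each transfer carries 2 bytes of data.
    return gt_per_s * bytes_per_transfer


if __name__ == "__main__":
    lane = pcie3_lane_gbytes_per_s()
    print(f"PCIe 3.0 x1 : {lane:.3f} GB/s")
    print(f"PCIe 3.0 x16: {lane * 16:.2f} GB/s")
    print(f"40 lanes    : {lane * 40:.2f} GB/s per CPU")
    print(f"QPI link    : {qpi_link_gbytes_per_s():.1f} GB/s per direction")
```

Real transfers carry protocol overhead on top of this, so sustained numbers will always land somewhat lower.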
Here we get a look at the task manager for our test system, which shows 36 cores / 72 threads that 2x E5-2699 v3 Xeon processors supply. These two CPUs provide a staggering amount of cores/threads in our test system.
In our tests, we will be using the new Crucial DDR4 memory, which has a speed of 2133 MHz, and a rating of CL15. We have already taken a look at these memory kits, and you can access our review here: Crucial DDR4-2133 DRx4 RDIMM Memory Review - Testing up to 256GB.
Here we can see the timings of the Crucial DDR4 memory that we will be using in our tests. As we fill every slot with memory sticks, the speed will drop to 1866MHz.
Here we see how the number of DIMMs per channel can affect memory speed.
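To show what a drop in memory speed costs in bandwidth, here is a small Python sketch of the usual peak-bandwidth arithmetic. It assumes four memory channels per socket and a 64-bit (8-byte) data bus per channel, which is our understanding of the E5-2600 v3 platform but is not stated in this review; the 1866 and 1600 entries are simply other standard DDR4 speed grades used for comparison.

```python
def ddr_peak_gbytes_per_s(mt_per_s: int, channels: int = 4, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth for one socket, in GB/s."""
    # transfers per second x bytes per transfer x number of channels
    return mt_per_s * 1e6 * bus_bytes * channels / 1e9


if __name__ == "__main__":
    for speed in (2133, 1866, 1600):
        per_socket = ddr_peak_gbytes_per_s(speed)
        print(f"DDR4-{speed}: {per_socket:.1f} GB/s per socket, "
              f"{2 * per_socket:.1f} GB/s for two sockets")
```

These are ceilings, so measured results from tools like AIDA64 or STREAM should be read against them rather than expected to match them.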
In the test lab, we found many new systems with multiple video cards and other expansion devices were starting to have larger power supply needs. We have now upgraded our test bed with a new Thermaltake Toughpower 1500W Gold PSU. We have already pushed this PSU to near max loads in the lab, and it has performed flawlessly. Even with max loads under full stress testing, the Toughpower 1500W Gold PSU hardly makes any noise at all, which we really like.
Here we see the Thermaltake Toughpower 1500W Gold PSU with its retail box. Before we installed components into the 7048GR-TR workstation, we prequalified the system on an open-air test bench, and for this, we used the Toughpower 1500W Gold PSU. We actually pulled more wattage than the rated 1,500 watts, but this PSU handled these heavy loads with no issues.
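As a rough sanity check on why a 1,500-watt bench supply has so little headroom with this configuration, here is a minimal Python sketch that adds up component TDPs. The 300W figure for each Xeon Phi 3120A and the 145W figure for each E5-2699 v3 are assumptions taken from Intel's published specifications, not measurements from this review. TDP is only a loose proxy for wall power, and drives, fans, memory, and PSU losses are ignored.

```python
# Assumptions: published TDP values, not measured power draw.
PHI_3120A_TDP_W = 300
XEON_E5_2699V3_TDP_W = 145


def tdp_budget_watts(num_phi: int = 4, num_cpu: int = 2) -> int:
    """Sum of coprocessor and CPU TDPs for the given configuration."""
    return num_phi * PHI_3120A_TDP_W + num_cpu * XEON_E5_2699V3_TDP_W


if __name__ == "__main__":
    budget = tdp_budget_watts()
    for psu_watts in (1500, 2000):
        print(f"{psu_watts}W PSU: {budget}W of TDP, "
              f"{psu_watts - budget}W left for everything else")
```

Even on this crude estimate, the coprocessors and CPUs alone nearly fill a 1,500-watt rating, which fits with what we saw on the open-air bench.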
Here are just a few specifications for the Toughpower 1500W Gold PSU. OK, let's get on with testing, and see what kind of performance a system like this has to offer.
Intel Xeon Phi 3120A Coprocessor
Before we continue with the review, we would like to say we had a huge amount of interest from our readers on just what these cards are, and how to use them.
First off, they cannot be used as a PhysX acceleration card to boost game performance, so don't expect them to run your Ubisoft games at greater than 30 FPS. When you install a Xeon Phi coprocessor in your system, it will not show up in the task manager as extra cores the way your CPU's cores do. These cards have no GUI, so you will not see a desktop for them; instead, you run them through a command line interface.
What they are, in simple terms, is a completely self-contained x86 processor with memory that runs an embedded Linux OS called uOS. That means you can compile x86 based code to run directly on the Xeon Phi Coprocessor. These cards communicate with the host system over the PCIe bus, so code can be running on the Phi, receive data from the host system, process that data, and send it back to the host. These cards can be run as a single processing unit, or clustered to communicate over the network. They have a great deal of flexibility in how they are used.
Intel describes the intended use for these cards as follows:
"The Intel Xeon Phi Coprocessor 3100 family provides outstanding parallel performance. It is an excellent choice for compute-bound workloads, such as MonteCarlo, Black-Scholes, HPL, LifeSc, and many others. Active and passive cooling options provide flexible support for a variety of server and workstation systems.
The Intel Xeon Phi Coprocessor 5100 family is optimized for high-density computing and is well-suited for workloads that are memory-bandwidth bound, such as STREAM, memory-capacity bound, such as ray-tracing, or both, such as reverse time migration (RTM). These coprocessors are passively cooled and have the lowest thermal design power (TDP) of the Intel Xeon Phi product family.
The Intel Xeon Phi Coprocessor 7100 family provides the most features and the highest performance and memory capacity of the Intel Xeon Phi product family. This family supports Intel Turbo Boost Technology 1.0, which increases core frequencies during peak workloads when thermal conditions allow. Passive and no thermal solution options enable powerful and innovative computing solutions."
If you are using these cards for development, you will find that there are a huge number of resources available on Intel's website that will get you started on compiling your own code.
Intel keeps a list of what it calls "Code Recipes for Intel Xeon Phi Coprocessor," which are downloadable code and "recipes" to help you build and run applications on these cards. Driver support for these cards comes with several Linux distros like Red Hat, and many different versions of Windows, such as Windows 8.1 and the server OSes. Intel has done a great job providing drivers and tools for Windows based systems; in fact, it is about the easiest way to get your system up and running using these cards.
For our tests, we will use the 3120A Xeon Phi coprocessor. It is simply a 57-core many-core processor with 6GB of GDDR5 memory, running an embedded Linux OS called uOS.
The CPU inside the Xeon Phi is referred to as a Many Integrated Core (MIC) coprocessor, or as "Mike." The Xeon Phi card itself has a System Management Controller (SMC), thermal sensors (inlet air, outlet air, coprocessor on-die thermal, and single GDDR5 sensor), and a cooling fan for the 3120A.
We wanted to take one of these cards apart to show the insides, but thought better of that because we have to return them. However, here you can see an exploded diagram showing how these cards are put together.
From Intel's website, we can find exactly what the PCB looks like for a typical Xeon Phi card.
The next two photos show the front and back of our 3120A samples. They look very much like a typical GPU that you might add to your system.
Here we see the back of the 3120A. If you notice, these cards are made from high quality parts, including a nice metal back plate, and plastic insulation to protect the covered areas.
There is a bracket to facilitate full-size cards, which aids in securing them in the case. These are mounted by four screws, and can be removed if necessary.
The power connections are very much like any GPU that we would use for graphics. 1x eight-pin and 1x six-pin power connectors provide the needed power for these cards.
Setting Up and Testing the Xeon Phi 3120A Coprocessors
Before we begin using our Xeon Phi 3120A coprocessors, or even installing them on the X10DRG-Q motherboard, we need to enable "Above 4G Decoding" in the BIOS to be able to use these cards. If you do not do this beforehand, and turn on the system to just let it run, at some point you will get a BIOS error screen. From there, you can head over to the PCIe/PCI/PnP Configuration menu, and change the setting.
As we said before, we had a great deal of interest in these cards, so we wanted to walk through our experience setting up and using the 3120As that we had. For a first time user, it can be a little intimidating and frustrating going through this, but once you wrap your head around what is going on, it's pretty simple to do. Let's get started.
Intel does provide a Xeon Phi Coprocessor Quick Start Developers Guide for Windows that we will follow along with. You should download this guide, and become familiar with what we will be going over here.
At this point, we have our system installed and running Windows 8.1 Enterprise. 4x Xeon Phi 3120As are installed in the PCIe slots, and the system is up and running. We will need to install drivers for the Xeon Phi, so head over to the Intel Developer Zone Page, Tools and Downloads for mic developers, and download the Manycore Platform Software Stack (MPSS).
You are looking for the latest MPSS release for Microsoft Windows. Inside this download, you will find two install packages that you will need to install, the Intel(R) Xeon Phi(TM) coprocessor, and the Intel(R) Xeon Phi(TM) coprocessor essentials.
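You are also going to need a few other programs, so let's get those while we are at it. These include Putty, Puttygen, WinSCP, and Intel Composer XE 2015, all of which are used in the steps that follow.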
Let's also download Linpack_11.2.1 for our testing. Create a folder on your desktop called "winshare," and put the unzipped linpack_11.2.1 folder in that.
We also need to find another file called "libiomp5.so." This is located in the Intel Composer XE 2015 directory, C:\Program Files (x86)\Intel\Composer XE 2015\compiler\lib\mic. Just copy "libiomp5.so" (without the quotations), and paste it into the winshare folder.
Right now, we should have all of the software we need downloaded and installed on our test system, and the two files placed inside the winshare folder. Make sure you have completed all these steps before you continue.
After you have installed both Intel(R) Xeon Phi(TM) coprocessor and Intel(R) Xeon Phi(TM) coprocessor essentials packages, bring up your control panel, and head to Programs and Features to confirm these two packages are installed. It should look similar to the picture above.
Now, bring up the Device Manager to check if the Xeon Phi cards show up. You can see we show 4x Xeon Phi cards here. So far, so good. It's fairly simple.
Now what we want to do is open your C: drive, drill down into C:\Program Files\Intel\MPSS\bin, and find "micsmc-gui." You can create a shortcut for this, and place it on your desktop. Go ahead and double click it. What you will see is the top window, which is the MPSS control panel. It will display average temps, memory usage, how many watts are being used, and average core utilization.
The "Cards" button will drop down a list of all the Xeon Phi cards installed. They will be listed as mic0, mic1, mic2, mic3 for our system. This can show all of the cards, or just the main display. The next button is "Advanced," which will show error logs, card info, and card settings. Here you can reconnect or restart your cards.
Under the "Cards" button, if we select "Show All," we will see this screen. The top window shows total system use, followed by each card below. Great. So far, we have our cards up and running, and we can see what each one is doing.
Now let's bring up a command line window, and run it as Administrator.
At the prompt, enter "> micinfo".
Now we will see complete information on all four cards. In this command window, we can issue many different commands to manage our cards.
Go ahead and enter "> micctrl -stop".
This will stop the cards.
This will start the cards up.
At this point, find the Putty and Puttygen programs that you downloaded before, and place them into C:\Program Files\Intel\MPSS\bin. We will use Putty to issue commands to the Phi cards via the SSH command window. Puttygen will create public and private SSH keys for us, which we will need to use for a secure connection.
Go ahead and run Puttygen. We used 1024 for the number of bits in our key to make a shorter key. When you select "Generate," move your mouse around in the window above to create the randomness used to generate the key.
Your screen should look something like this when done.
Open Notepad and create a file called "authorized_keys" (no .txt file extension). Copy the Public Key string, and paste it into this file. When finished, place that file into C:\Program Files\Intel\MPSS\bin.
Now hit "Save Private Key," give it the name "id_rsa.ppk," and save it into C:\Program Files\Intel\MPSS\bin.
The C:\Program Files\Intel\MPSS\bin folder should look just like this now.
Next, we are going to run Putty to open an SSH connection to the Xeon Phi cards. We did have some issues with this; for some reason it did not like our "authorized_keys" file.
To get around this issue, we used this command in our Windows' command line window.
> micctrl --addssh root "your key that you generated in Puttygen"
You can open the "authorized_keys" file, and copy the key there to use in this command line.
Issuing this command will upload your key to each of your cards.
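We never pinned down why our "authorized_keys" file was rejected. One common cause of this kind of failure in general (not confirmed to be what happened here) is a key pasted in the wrong format, or split across lines by a text editor. The following Python sketch is a hypothetical sanity check, not part of MPSS or Putty, that tests whether a string looks like a single-line OpenSSH public key before you paste it into a file or the micctrl command.

```python
import base64
import binascii
import struct

KNOWN_KEY_TYPES = {"ssh-rsa", "ssh-dss", "ssh-ed25519"}


def looks_like_openssh_pubkey(text: str) -> bool:
    """Return True if text is one line of the form '<type> <base64> [comment]'."""
    text = text.strip()
    if "\n" in text or "\r" in text:
        return False  # wrapped or multi-line keys are rejected
    parts = text.split(None, 2)
    if len(parts) < 2 or parts[0] not in KNOWN_KEY_TYPES:
        return False
    try:
        blob = base64.b64decode(parts[1], validate=True)
    except (binascii.Error, ValueError):
        return False
    if len(blob) < 4:
        return False
    # The decoded blob starts with a 4-byte big-endian length followed by
    # the key type name, which must match the leading text field.
    (name_len,) = struct.unpack(">I", blob[:4])
    return blob[4:4 + name_len] == parts[0].encode("ascii")


if __name__ == "__main__":
    # Synthetic blob for demonstration only; this is not a usable key.
    fake_blob = struct.pack(">I", 7) + b"ssh-rsa" + bytes(16)
    sample = "ssh-rsa " + base64.b64encode(fake_blob).decode("ascii") + " demo"
    print(looks_like_openssh_pubkey(sample))
    print(looks_like_openssh_pubkey("---- BEGIN SSH2 PUBLIC KEY ----"))
```

A check like this only validates the shape of the string; it says nothing about whether the key matches your private key file.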
Launch Putty, and you will see this screen.
For Host Name use:
This will be the first card, or mic0. For mic1, mic2, mic3 use:
mic1 - firstname.lastname@example.org
mic2 - email@example.com
mic3 - firstname.lastname@example.org
Move down to SSH, and open that selection. Then select "Auth" to see the screen above.
Here, hit the "Browse" button, and locate the "id_rsa.ppk" file. It will be located in the directory shown in the screen shot.
Then hit "Open."
Now we have an open SSH window into the root level of mic0. Here is where we had problems: it would ask for a password, and would not open for us. This is why we used the "> micctrl --addssh root 'your key that you generated in Puttygen'" command before.
Once we issued that command, we had no problems, and the Putty SSH command window started just as we see here. So far, so good; we now have an open SSH connection where we can issue SSH commands to the Xeon Phi card. Next, you are going to want to download and install WinSCP, if you have not done so already.
Before we can use WinSCP, we will need to set a password. In the Putty SSH command window, enter: # passwd
Enter a password that you will use for WinSCP.
Are you with us so far? Good. Now we are going to use WinSCP to upload LINPACK and the libiomp5.so file to the Xeon Phi cards. Launch WinSCP and change the file protocol to SCP. Enter the IP address for your Xeon Phi card, which is 192.168.1.100.
In the password field, enter the password you created in the Putty SSH command window with >passwd.
You can now save this, so you do not have to do it again. Notice we have done this for all four Xeon Phi cards in our system. Log in when completed.
WinSCP will start and open a window, just as we show here. In the left window, navigate to the "winshare" folder you created on your desktop. The right window will be looking at the root directory of the Xeon Phi card. Hit the top icon, or back one directory.
Now we are looking at all the system folders on the Xeon Phi in the right window. Looks just like a Linux machine, right?
Open the "lib64" folder, and upload the "libiomp5.so" file into it.
Let's go back one directory in the right window, and open the "Home" folder. Here you can see the list of users on the Xeon Phi. The "micuser" folder was created by us before. Upload the "linpack_11.2.1" folder now.
Now, let's open the "linpack_11.2.1" folder, and move down the directory to the "linpack" folder; you can see the listing of files in it here. If you try to run LINPACK now, you will receive permission errors, so we need to fix that.
Go back to the Putty SSH command window, and change the directories to the LINPACK folder.
Enter: # cd /home/linpack_11.2.1/benchmarks/linpack
Then: # ls
We now see a listing of files in the "LINPACK" folder. We need to change permissions on two files. Enter the following commands:
# chmod 777 xlinpack_mic
# chmod 777 runme_mic
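For readers less familiar with Linux permission bits, here is a short Python sketch of what those mode numbers mean. It is an illustration you can run on any Linux host, not something we ran on the cards; the helper name make_executable is our own. It shows the 777 mode used above next to the tighter 755, which still lets the file's owner execute it.

```python
import os
import stat
import tempfile


def make_executable(path: str, mode: int = 0o755) -> str:
    """Apply an octal mode to path and return it in 'ls -l' style."""
    os.chmod(path, mode)
    return stat.filemode(os.stat(path).st_mode)


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # 777: read, write, and execute for owner, group, and everyone else.
        print("777 ->", make_executable(path, 0o777))
        # 755: full access for the owner, read and execute for others.
        print("755 ->", make_executable(path, 0o755))
    finally:
        os.remove(path)
```

On a shared machine, the narrower mode is usually the safer habit.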
Great, we are almost ready to run our first test.
The next problem we ran into was with running LINPACK.
The downloaded version of LINPACK is created for a different setup with 16GB of RAM on the Xeon Phi cards, and will not work on the 3120A, which only has 6GB of RAM. Let's fix that. Select the "lininput_mic" file, and right click to bring the menu up, then select "Edit."
This is what you will see after you hit "Edit." The two lines we need to fix are the long numbered ones labeled "# problem sizes" and "# leading dimensions."
Remove the last four groups of numbers in each of those. Replace them with:
Those two edited lines should now look just like what is shown above. Hit the "Save" button.
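The reason the problem sizes have to shrink is that LINPACK solves a dense N x N system in double precision, so the matrix alone needs roughly N x N x 8 bytes. Here is a minimal Python sketch for estimating the largest N that fits in a given amount of card memory. The 80 percent usable fraction and the rounding to a multiple of 64 are our own assumptions for illustration; they are not values from Intel's input file.

```python
import math

GIB = 2 ** 30


def max_linpack_n(mem_bytes: int, fraction: float = 0.8, align: int = 64) -> int:
    """Largest N (rounded down to a multiple of align) whose N x N matrix
    of 8-byte doubles fits within fraction of mem_bytes."""
    usable_doubles = int(mem_bytes * fraction) // 8
    n = math.isqrt(usable_doubles)
    return n - n % align


if __name__ == "__main__":
    for gib in (6, 16):
        n = max_linpack_n(gib * GIB)
        print(f"{gib}GB card: N up to about {n} "
              f"({n * n * 8 / GIB:.2f} GiB for the matrix)")
```

Any problem size in the input file well above the 6GB estimate is a good candidate for removal on the 3120A.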
Wow, that was a lot of work to get these cards set up; I hope you were able to follow along, and get to this point without too many issues. We are now ready to run our first test.
Let's go back to our desktop and look at the MPSS control panel, and have it show the first Xeon Phi card, or mic0. We will now see two windows, one for total system use, and mic0 just below that.
Back in our Putty SSH command window, we only have to enter one command to start LINPACK:
The window now shows LINPACK running on one Xeon Phi card. You can see the Core Utilization graph showing different CPU loads; temperatures should increase a bit, and watt usage should also increase.
After about 15 to 20 minutes, LINPACK will finish up, and we can see our test results. Wow - look at that! 713 GFLOPS on one card. That is impressive!
Because we wanted to fully load our test system to measure power use, we ran LINPACK on all 4x Xeon Phi cards. We are looking at ~2,852 GFLOPS right now, which is very impressive!
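To put these numbers in context, here is a small Python sketch of the arithmetic behind the four-card total, along with a comparison against a theoretical peak. The 57-core count comes from this review; the 1.1GHz clock and 16 double-precision operations per core per cycle are assumptions based on Intel's published 3120A specifications, so treat the efficiency figure as an estimate.

```python
MEASURED_GFLOPS_PER_CARD = 713.0  # our single-card LINPACK result
NUM_CARDS = 4

# Assumptions (published specs, not measured in this review):
PHI_CORES = 57
PHI_CLOCK_GHZ = 1.1
DP_FLOPS_PER_CYCLE = 16  # 512-bit vectors: 8 doubles x 2 for fused multiply-add


def peak_dp_gflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical double-precision peak in GFLOPS."""
    return cores * clock_ghz * flops_per_cycle


def aggregate_gflops(per_card: float, cards: int) -> float:
    """Combined rate when every card runs its own independent LINPACK."""
    return per_card * cards


if __name__ == "__main__":
    peak = peak_dp_gflops(PHI_CORES, PHI_CLOCK_GHZ, DP_FLOPS_PER_CYCLE)
    total = aggregate_gflops(MEASURED_GFLOPS_PER_CARD, NUM_CARDS)
    print(f"Peak per card : {peak:.1f} GFLOPS")
    print(f"Efficiency    : {MEASURED_GFLOPS_PER_CARD / peak:.0%}")
    print(f"Four cards    : {total:.0f} GFLOPS")
```

Note that the four-card figure is a sum of independent runs, not a single LINPACK problem spread across all cards.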
Even though we only have one test, we are going to chart this result, so we can use it later when we test other Xeon Phi cards.
We will begin testing the CPU portion of our benchmarks. On our system's test, we set the BIOS to default settings, and only change the boot device. We will see the effects of power management and other power saving features here. This is how a typical system would be used.
We will also be using 2x Intel Xeon E5-2699 v3 (18 core) processors on all of our tests from now on. The motherboard inside the Supermicro 7048GR-TR Workstation is the Supermicro X10DRG-Q, and we will refer to it in our benchmarks using this moniker. We have also dropped CINEBENCH 11.5 from our benchmarks because it will only use 64 cores/threads; the E5-2699 v3s used here total 36 cores/72 threads, which exceeds the CINEBENCH 11.5 limit.
The 7048GR-TR Workstation is showing very good results in the multi-core benchmark; in fact, this is the highest we have seen so far, which is no surprise with the extra cores/threads used with the E5-2699 v3s. The single-core results are a little lower, and this is mainly because of power-saving features.
wPrime is a leading multi-threaded benchmark for x86 processors that tests your processor performance. This is a great test to use to rate the system speed; it also works as a stress test to see how well the system's cooling is performing.
In wPrime, the 7048GR-TR Workstation using E5-2699 v3s shows very good 32M scores, but is slightly slower in the 1024M test.
Memory & System Benchmarks
AIDA64 memory bandwidth benchmarks (Memory Read, Memory Write, and Memory Copy) measure the maximum achievable memory data transfer bandwidth.
Memory bandwidth for the 7048GR-TR Workstation is looking very good. Although these are not huge increases in bandwidth, the 7048GR-TR Workstation is showing strong numbers. Again, we see all motherboards using the new C612 chipset and Haswell-EP processors performing very close to each other.
Intel Optimized LINPACK Benchmark is a generalization of the LINPACK 1000 benchmark. It solves a dense (real*8) system of linear equations (Ax=b), measures the amount of time it takes to factor and solve the system, converts that time into a performance rate, and tests the results for accuracy.
LINPACK measures a computer's floating-point rate of execution, reported in GFLOPS (billions of floating-point operations per second); ten billion FLOPS = ten GFLOPS. LINPACK is a very heavy compute application that can take advantage of the new AVX2 instructions. As it puts a very high load on the system, it is also a good stress test program.
LINPACK running on dual E5-2699 v3 processors shows strong compute throughput. At 885 GFLOPS, this is the highest LINPACK result we have recorded. These speeds, coupled with fast DDR4, should give a real boost to application performance.
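For those curious how a run time turns into a GFLOPS figure, here is a minimal Python sketch using the conventional LINPACK operation count of 2/3 x N^3 + 2 x N^2 for a system of N equations. The function names and the sample problem size are ours for illustration; we are not reproducing Intel's own code or any result from this review.

```python
def linpack_flop_count(n: int) -> float:
    """Conventional operation count for solving a dense n x n system."""
    return (2.0 / 3.0) * n ** 3 + 2.0 * n ** 2


def linpack_gflops(n: int, seconds: float) -> float:
    """Performance rate implied by solving an n x n system in seconds."""
    return linpack_flop_count(n) / seconds / 1e9


def seconds_at_rate(n: int, gflops: float) -> float:
    """Time needed for an n x n solve at a sustained rate in GFLOPS."""
    return linpack_flop_count(n) / (gflops * 1e9)


if __name__ == "__main__":
    n = 40000  # hypothetical problem size, chosen only for illustration
    for rate in (100.0, 500.0, 1000.0):
        print(f"N={n} at {rate:.0f} GFLOPS: {seconds_at_rate(n, rate):.1f} s")
```

Because the work grows with the cube of N, doubling the problem size takes roughly eight times as long at the same rate.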
Geekbench - Stream
Geekbench 3 is Primate Labs' cross-platform processor benchmark, with a new scoring system that separates single-core and multi-core performance, and new workloads that simulate real-world scenarios. It also includes STREAM based memory tests, which we will include in our reviews.
Here we are looking at the single-core STREAM memory tests. Bandwidth is about where it would be expected for single-core applications.
Now we are looking at multi-core STREAM tests. The speeds we are seeing are a little faster than advertised bandwidths for our Crucial Memory kits, which shows these new X10 platforms from Supermicro can squeeze out fast memory bandwidth.
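For readers who want to compare measured STREAM numbers against an advertised figure, theoretical peak bandwidth is simply the transfer rate multiplied by the bus width and the number of channels. The sketch below is a minimal illustration of that formula; the function name is ours, and DDR4-2133 is used purely as a hypothetical example, since the kit speed is not stated in this section.

```python
def peak_bandwidth_gbs(transfer_rate_mts: float, channels: int = 1,
                       bus_width_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10^9 bytes).

    transfer_rate_mts -- memory transfer rate in MT/s (e.g. 2133 for DDR4-2133)
    channels          -- number of populated memory channels
    bus_width_bytes   -- bytes moved per transfer per channel (64-bit bus = 8)
    """
    if transfer_rate_mts <= 0 or channels < 1 or bus_width_bytes < 1:
        raise ValueError("all arguments must be positive")
    return transfer_rate_mts * 1e6 * bus_width_bytes * channels / 1e9


if __name__ == "__main__":
    # Hypothetical example values; substitute the rated speed of your own kit.
    print(f"single channel: {peak_bandwidth_gbs(2133):.3f} GB/s")
    print(f"four channels:  {peak_bandwidth_gbs(2133, channels=4):.3f} GB/s")
```

Real-world results depend on how many channels a given test actually exercises, so treat the output as a reference point for comparison rather than a prediction.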
UnixBench and SPEC CPU2006v1.2
UnixBench has been around for a long time now, and is a good general-purpose bench to test on Linux based systems. This is a system benchmark, and it shows the performance of single-threaded and multi-threaded tasks.
Synthetic benchmarks only show part of the performance of a motherboard. When using tests that are more complex, we will start to see a different trend in the scores.
UnixBench starts to show what the 7048GR-TR Workstation can do really well, and that is multi-threaded workloads using the E5-2699 v3 processors.
SPEC CPU2006 v1.2
SPEC CPU2006v1.2 measures compute intensive performance across the system using realistic benchmarks to rate real performance.
In our testing with SPEC CPU2006, we use the basic commands to run these tests:
"runspec --tune=base --config=tweaktown.cfg", followed by "int" or "fp".
For multi-threaded runs, we add "--rate=72".
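The invocations above can be assembled programmatically when scripting several runs. The following is a minimal sketch that only builds the argument list from the flags quoted in the text; the helper name build_runspec_args is hypothetical, and the code does not launch SPEC itself.

```python
from typing import List, Optional


def build_runspec_args(suite: str, config: str = "tweaktown.cfg",
                       rate_copies: Optional[int] = None) -> List[str]:
    """Build the runspec argument list described in the text.

    suite       -- "int" or "fp"
    config      -- config file name passed via --config
    rate_copies -- None for a single-threaded (speed) run, or a copy
                   count (72 in our multi-threaded runs) to add --rate
    """
    if suite not in ("int", "fp"):
        raise ValueError("suite must be 'int' or 'fp'")
    args = ["runspec", "--tune=base", f"--config={config}"]
    if rate_copies is not None:
        if rate_copies < 1:
            raise ValueError("rate_copies must be at least 1")
        args.append(f"--rate={rate_copies}")
    args.append(suite)
    return args


if __name__ == "__main__":
    # Print the four command lines: speed and rate runs for int and fp.
    for suite in ("int", "fp"):
        print(" ".join(build_runspec_args(suite)))
        print(" ".join(build_runspec_args(suite, rate_copies=72)))
```

Keeping the command as a list rather than one string makes it straightforward to hand to a process launcher later without quoting problems.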
When SPEC CPU first came out, these tests could take up to a week to run, but as computers have become faster, our tests now take up to four days for a full run, and even less on some systems. The user can do many things to affect the results of CPU2006 runs, including compiler optimizations, add-ons like SmartHeap, and different commands used to start the tests.
This benchmark has many different commands to use depending on what the user is looking for. For our tests, we used basic commands that run a full test with a base tune.
Here you can see the SPEC scores after full runs for Integer (int) and Floating Point (fp) tests.
Single-core runs show how fast (speed) a CPU can perform a given task. In the multi-core runs, we set SPEC CPU2006v1.2 to use all threads to measure the throughput of the system.
The additional cores/threads of this system have a huge impact on performance in these tests, and really show the amount of horsepower that a dual-socket system has. Single-threaded results are still very important, but when you need to run many single-threaded apps at once, moving to a CPU with more cores is the way to go. The 7048GR-TR Workstation with the E5-2699 v3 processors starts to shine in multi-threaded integer workloads.
Looking at the results of single-threaded integer runs, we can get an idea of speed at which the E5-2699 v3s can crunch through the different integer tests. Not all CPUs are equal here, and ones that have a higher speed will perform these tests faster. Naturally, using an overclocked system, or CPUs with a higher stock speed, will generate higher results.
Now we run the test using all 36 cores/72 threads on the E5-2699 v3 processors to measure the throughput of the system. In this test, more cores/threads will have a greater effect on the outcome.
Just like the integer tests, we now run the floating-point tests in single-threaded (speed) mode.
Here we see the results of the multi-core floating-point run that uses all 36 cores/72 threads on the E5-2699 v3 processors. Like the multi-threaded integer test, more cores/threads will have a greater impact on the test.
Just like the integer multi-threaded tests, the 7048GR-TR Workstation really takes off here.
Power Consumption & Final Thoughts
We have upgraded our power testing equipment, and now use a Yokogawa WT310 power meter for testing. The Yokogawa WT310 feeds its data through a USB cable to another machine where we can capture the test results.
To test total system power use, we used the AIDA64 Stability test to load the CPU, and then recorded the results. We also now record power use from the off state, through pressing the power button, all the way to the desktop. This gives us data on power consumption during the boot process.
The Supermicro 7048GR-TR workstation is a powerful system with a large number of included components; as a result, it will use a fair amount of power. In our tests, under normal use on the desktop, we saw ~540 watts used.
We ran our power tests using just one 3120A, then two, then three, and then all four, and finally with the 2x E5-2699 v3 processors added. Because of the way LINPACK cycles, the graphs were rather confusing, and had plenty of overlapping lines. We trimmed our graph results to show just one 3120A under full load, and then all four 3120As together with the 2x E5-2699 v3s to show the maximum power load of the full system.
With just one 3120A under load, we saw power usage max out at ~700 watts. After putting the full system under max load, we saw total power use almost reaching 1,600 watts. That is a staggering amount of power used, but considering the 7048GR-TR Workstation's load out, it is not surprising.
Booting the 7048GR-TR Workstation also consumes a fair amount of power; we saw this peak out at ~680 watts, with the system settling down to ~540 watts after the boot process was complete.
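To show how figures such as a boot peak and a settled desktop level can be pulled from a logged trace, here is a small sketch. It assumes the meter log has already been exported to a plain list of watt readings; the function name, the tail window, and the sample values are hypothetical and are not the data we captured from the WT310.

```python
from statistics import mean
from typing import Sequence, Tuple


def summarize_power(samples: Sequence[float], tail: int = 5) -> Tuple[float, float]:
    """Return (peak_watts, settled_watts) for a power trace.

    peak_watts    -- highest reading anywhere in the trace
    settled_watts -- mean of the last `tail` readings, used as a simple
                     stand-in for the level the system settles to
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if tail < 1:
        raise ValueError("tail must be at least 1")
    return max(samples), mean(samples[-tail:])


if __name__ == "__main__":
    # Hypothetical readings in watts, for illustration only.
    trace = [0.0, 150.0, 420.0, 610.0, 590.0, 520.0, 500.0, 505.0, 495.0, 500.0]
    peak, settled = summarize_power(trace, tail=4)
    print(f"peak: {peak:.0f} W, settled: {settled:.0f} W")
```

Averaging the tail of the trace smooths out small fluctuations once the system reaches the desktop; a longer window gives a steadier figure at the cost of needing a longer capture.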
Previously, we looked at the Supermicro X10DRG-Q (Intel C612) Workstation Motherboard with five PCIe slots. The X10DRG-Q motherboard is the base of the 7048GR-TR Workstation, and provides the ability to use all four 3120A Xeon Phi Coprocessor cards.
When your HPC needs increase, a workstation like the Supermicro 7048GR-TR is well equipped to handle just about any computational task that you can run on it.
The Supermicro 7048GR-TR Workstation is the base unit that can run several different types of expansion cards, depending on your needs. This is a massive workstation designed for HPC applications, or high-end workstation uses. With the base unit itself, you can add a large RAID system for storage needs. We had the building blocks for a 7048GR-TR Workstation here in the lab for a while now, and when the new Haswell-EP platforms came out, we were able to set this up with all current equipment. This setup included installing the X10DRG-Q motherboard, E5-2699 v3 CPUs, and 256GB of Crucial DDR4 RAM.
As we have said before, the case used for the 7048GR-TR Workstation is simply the best in quality, craftsmanship, and features. When you are installing components similar to the ones we used in our system, only the best case will do, and this is it. We have used this case many times in our builds in the past, and it is our go-to case every time. The 7048GR-TR Workstation is designed for maximum uptime with hot-swappable drives and cooling fans, and includes dual redundant power supplies. It is also easy to upgrade, should you decide to move up to higher-end Xeon Phi cards, or other expansion cards.
In our review of the X10DRG-Q workstation motherboard, we recommended installing the AOC-TBT-DSL5320 Thunderbolt Add-On Card; it will allow remote running of the machine, so it can be installed in a location separate from the user. When this system is up and running under full loads, it can generate a fair amount of heat, and this will make the cooling fans spin up to higher speeds, which can make a fair amount of noise.
When we look at our LINPACK results for one Intel Xeon Phi 3120A Coprocessor, we see GFLOPS results that almost equal many of the dual CPU systems we have tested. The full system generated a staggering 3,737 GFLOPS, which is very impressive. The Xeon Phi 3120As act like a compute multiplier, taking computational power equivalent to five systems and combining it into one 4U workstation. This could be as much as 10U worth of server space condensed into one 4U machine.
Power use for this system is also rather high, but consider that a machine like this can replace up to five systems, and it will actually come out to be about equal or less in power consumption.
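The consolidation argument can be sanity-checked with quick arithmetic using only figures reported in this review: 3,737 GFLOPS and roughly 1,600 watts for the full system, and 885 GFLOPS for the two E5-2699 v3 processors on their own. The names below are ours, and treating one dual-CPU result as "one system" is a simplifying assumption.

```python
FULL_SYSTEM_GFLOPS = 3737.0  # full-system LINPACK result reported in this review
FULL_SYSTEM_WATTS = 1600.0   # approximate full-load draw ("almost reaching 1,600 watts")
DUAL_CPU_GFLOPS = 885.0      # dual E5-2699 v3 LINPACK result reported in this review


def gflops_per_watt(gflops: float, watts: float) -> float:
    """Compute efficiency as GFLOPS delivered per watt drawn."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return gflops / watts


def equivalent_systems(total_gflops: float, per_system_gflops: float) -> float:
    """How many baseline systems would match the total GFLOPS figure."""
    if per_system_gflops <= 0:
        raise ValueError("per_system_gflops must be positive")
    return total_gflops / per_system_gflops


if __name__ == "__main__":
    efficiency = gflops_per_watt(FULL_SYSTEM_GFLOPS, FULL_SYSTEM_WATTS)
    ratio = equivalent_systems(FULL_SYSTEM_GFLOPS, DUAL_CPU_GFLOPS)
    print(f"{efficiency:.2f} GFLOPS per watt at full load")
    print(f"{ratio:.1f}x the dual E5-2699 v3 LINPACK result")
```

Against the dual E5-2699 v3 baseline, the ratio works out to a little over four; the comparison with a larger number of systems applies when the baseline is a slower dual-CPU machine.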
We only had the 3120A Xeon Phi cards in the lab for a short time, which limited what tests we could run. There were several more we wanted to do, but we simply ran out of time. However, most of the applications that a user would run on these cards are relatively complex and specific to their own workloads, and as such, they are far outside the scope of this review. Intel has done a great job of providing resources for developers on all platforms, including Linux and Windows based systems.
Systems like this are the backbone of many HPC infrastructures, and are growing rapidly. Supermicro is well equipped to provide these systems with high-quality, well calculated designs to meet these HPC demands, and the 7048GR-TR Workstation is just one example of what Supermicro has to offer.
| Quality including Design and Build | 98% |
| Bundle and Packaging | 95% |
| Value for Money | 98% |
The Bottom Line: If you need Intel Xeon Phi Coprocessor support in your workstation, the Supermicro 7048GR-TR tower workstation system can power your HPC needs with ease with a maximum of four Phi cards installed.