Be a lifelong learner with a natural curiosity to figure out how the world works, and an architect with a passion to shape the world to come by crafting the next big thing. Don't worry dude, just hacking!
Tomorrow (Jun 30th, 2015) will be my last working day at IBM Research – China. When I decided to join CRL in April 2011, I saw this adventure as a world-leading industry PhD program in the most interesting market: China. Instead of continuing my academic career as a regular PhD student with my advisor, Prof. Per Stenstrom, I was more interested in making a real impact in the real business world. Four years later, I would say I was very lucky to have enjoyed a fantastic journey with you, who not only helped, inspired, encouraged, and mentored me, but also became my lifelong friends.
Last week I visited IBM Research – Almaden, and I saw a saying on the lobby wall: “Science and data to extend human capability”. IBM Research is no doubt a remarkable organization for disruptive innovation in human history. I am so proud that I got the chance to work with you on being essential to our society. After my graduation from IBM Research, I will start a new adventure building cool big data technology at OneAPM, a startup that shares many of my interests. I hope what I learned at CRL will help me become a person who can shape the world to come to some degree.
Please allow me to take this opportunity to thank you all for your kind support over the years. Life is a long, long journey; we will definitely have the chance to meet each other again :)
Please find my contact details below. I wish you all the best in the future!
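Introduction: IBM Research has recently done a lot of work in the Big Data area. For example, in April our group sorted 1TB of data in 14 minutes on 10 P730 servers with POWER7 processors (and in July we sorted 1TB in 8 minutes 44 seconds on 10 Power7R2 machines). This work has been published as an IBM Research Report. Everyone is welcome to take a look and share feedback. Thanks.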
The use of Big Data underpins critical activities in all sectors of our society. Achieving the full transformative potential of Big Data in this increasingly digital world requires both new data analysis algorithms and a new class of systems to handle the dramatic data growth, the demand to integrate structured and unstructured data analytics, and the increasing computing needs of massive-scale analytics. In this paper, we discuss several Big Data research activities at IBM Research: (1) Big Data benchmarking and methodology; (2) workload optimized systems for Big Data; (3) case study of Big Data workloads on IBM Power systems. In (3), we show that preliminary infrastructure tuning results in sorting 1TB data in 14 minutes on 10 Power 730 machines running IBM InfoSphere BigInsights. Further improvement is expected, among other factors, on the new IBM PowerLinux™ 7R2 systems.
By: Anne E. Gattiker, Fadi H. Gebara, Ahmed Gheith, H. Peter Hofstee, Damir A. Jamsek, Jian Li, Evan Speight, Ju Wei Shi, Guan Cheng Chen, Peter W. Wong
Published in: RC25281 in 2012
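As a rough back-of-the-envelope check on the two sort results mentioned in this post (14 minutes on 10 Power 730 machines, then 8 minutes 44 seconds on 10 7R2 machines), the sketch below converts them into throughput. It assumes that 1TB means 10^12 bytes and that the work is spread evenly across the nodes; neither is stated here, and the function and variable names are only illustrative.

```python
# Assumption: 1 TB = 10**12 bytes (decimal); the report may use a different definition.
TB_BYTES = 10**12
NODES = 10  # both runs used 10 machines

# Wall-clock sort times quoted in the post, in seconds.
RUNS = {
    "Power 730 (April)": 14 * 60,
    "7R2 (July)": 8 * 60 + 44,
}


def throughput_mb_per_s(total_bytes, seconds):
    """Aggregate sort throughput in MB/s (1 MB = 10**6 bytes)."""
    return total_bytes / seconds / 1e6


def summarize(runs, total_bytes=TB_BYTES, nodes=NODES):
    """Return {label: (aggregate MB/s, per-node MB/s)} for each run."""
    summary = {}
    for label, seconds in runs.items():
        aggregate = throughput_mb_per_s(total_bytes, seconds)
        # Per-node figure assumes an even split of work across nodes.
        summary[label] = (aggregate, aggregate / nodes)
    return summary


if __name__ == "__main__":
    for label, (aggregate, per_node) in summarize(RUNS).items():
        print(f"{label}: {aggregate:.1f} MB/s aggregate, {per_node:.1f} MB/s per node")
    speedup = RUNS["Power 730 (April)"] / RUNS["7R2 (July)"]
    print(f"Speedup from April to July: {speedup:.2f}x")
```

Under these assumptions, the April run works out to roughly 1.2 GB/s across the cluster and the July run to roughly 1.9 GB/s, about a 1.6x improvement.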
IBM Research China is looking for graduate computer science/engineering students who are interested in Hadoop performance optimization work.
Location: Beijing
Job Title: Research Intern
Job Openings: 1
Expected Duration: at least 3 months (full-time preferred)
Job responsibilities:
– Write MapReduce programs and analyze Hadoop performance models.
– Tune and optimize the performance of Hadoop workloads.
– Publish high quality research papers to report your work.
Requirements:
– Creative and self-motivated.
– Knowledge of Parallel Computing and Distributed Systems.
– Knowledge of Java.
– Familiarity with Linux as development and testing environments.
– Knowledge of Apache Hadoop is a plus.
– Past research experience is a plus.
If you’re interested, please feel free to send your Chinese or English resume with the email subject “Intern_Your Name_University_Major_Grade” (e.g. Intern_Zhang San_XXU_CS_Master) to chengc_at_cn.ibm.com.
IBM Research China is looking for undergraduate and graduate computer science/engineering students who are interested in Big Data Analytics development and performance optimization work.
Location: Beijing
Job Title: Research Intern
Job Openings: 1-2
Expected Duration: at least 3 months (full time)
Job responsibilities:
– Develop text mining solutions such as Topic Detection and Tracking (TDT).
– Write Apache MapReduce user function code to implement Social Network Analysis (SNA), Machine Learning and Text Mining algorithms.
– Tune and optimize the performance of Apache MapReduce/HDFS based analytics workloads on POWER7.
– Publish high quality research papers to report your work.
Required skills:
– Knowledge of 1) Parallel Computing and Distributed Systems or 2) Machine Learning and Data Mining.
– Knowledge of Java.
– Familiarity with Linux as development and testing environments.
– Experience with Apache Hadoop is a plus.
We also encourage exceptional students to generate and implement their own ideas related to Big Data Analytics over the course of the internship.
If you’re interested, please feel free to send us an email with your resume at jwshi_at_cn.ibm.com.