Tuesday, 2 August 2016

Machine Learning: Google and Microsoft Want Every Company to Scrutinize You with AI

When some patients of Dartmouth-Hitchcock Medical Center in New Hampshire step on their bathroom scale at home, Microsoft’s computers know about it. The corporation’s machines also get blood pressure readings taken at home. And they can even listen to calls between nurses and patients to gauge a person’s emotional state. Microsoft’s artificial intelligence software parses that data to try to warn patients and staff of emerging health problems before any human notices.

The hospital is previewing both the future of health care and of Microsoft’s business. It’s using a suite of new “cognitive” services recently added to Microsoft’s cloud computing service, called Azure. The company says renting out its machine-learning technology will unlock new profits, and enable companies of all kinds to subject their data—and customers—to artificial-intelligence techniques previously limited to computing giants.

“Customers are going to mature from classic cloud services to services that use elements of machine learning and AI,” says Herain Oberoi, director of product management at Microsoft, who oversees the company’s cloud machine-learning services. “Every company I talk with has someone extremely senior tasked with thinking about how to make this technology work for them.”

Microsoft’s competitors Google, IBM, and Amazon are making the same bet. Google announced in June that it had invented a new kind of chip to accelerate machine-learning software and make its cloud services more competitive. The company lags Amazon and Microsoft in the cloud market, and CEO Sundar Pichai has said machine-learning services provide a way for Google to differentiate itself. Amazon’s cloud division, Amazon Web Services, launched its first machine-learning cloud services last year, and in June the group’s head, Andy Jassy, pledged to expand them significantly in the coming months.

Amazon and its largest competitors stepped up their investments in machine-learning technology in recent years after breakthroughs in software that can be trained to do tasks such as interpret photos or speech (see “10 Breakthrough Technologies 2013: Deep Learning”).

Some of the first consumer products to take advantage of those breakthroughs were Amazon’s Alexa voice-operated home assistant and Google’s new Photos service, which understands the content of images and has more than 200 million users. Adding machine learning to the cloud services that corporations already use to outsource tasks such as data storage and analysis is seen as another way to extract money from the technology and enhance the very lucrative market. IDC estimates that corporations spent almost $70 billion with cloud providers last year, and predicts that will double before the end of the decade.

Rob Craft, who leads product management for Google's cloud machine-learning offerings, says that most companies are in a position to benefit from machine learning right away because they have a lot of data on hand about their operations, business, and customers. “Our goal is to help them have more direct value from that data,” he says.

The most straightforward of the new services offered by Google and others do things like describe the content of images, transcribe audio files such as phone calls, extract key terms from text, or translate text between languages. Although seen as lagging behind Google in machine-learning technology, Microsoft and IBM have so far rolled out the broadest range of such services, known as APIs.

Microsoft has an API that tries to decipher facial expressions, for example. IBM has one that assesses the personality of the author of text such as social media posts. Marketing company Influential uses it to help brands such as Corona and Red Bull identify the most useful social media users for promotional efforts. Different APIs can be combined. For example, a company could set up a system that spots its logo in social media images, notes the facial expression of any people in the photo, and extracts key terms from any accompanying text.

Many key software components needed to build the kind of machine-learning systems that Google and others hope will be so valuable are free (see “Facebook Joins Stampede of Tech Giants Giving Away AI Technology”). But Jimoh Ovbiagele, cofounder and chief technology officer at startup ROSS Intelligence, which provides software that speeds up legal research to major law firms, says that the time and expense of building and operating a top-notch machine-learning system means many companies are better off renting the technology.

“It makes sense to stand on the shoulders of giants,” says Ovbiagele. ROSS’s ability to understand legal questions is built on IBM’s suite of language processing technology, some of which originated with the Watson computer that beat two Jeopardy! champions in 2011.

Chris Curran, chief technologist with PwC, says most large corporations are still far from ready to spend significantly on machine-learning services, though. He estimates about three quarters are in “watch and learn” mode, waiting to see what these new capabilities offer.

And while the new services from Microsoft and others make it easy for non-technology companies like Dartmouth-Hitchcock Medical Center to use preprogrammed machine-learning systems, the technology is most valuable when customized for an organization’s specific needs, says Curran. Google’s and Microsoft’s image APIs are good at general assessments, such as whether a photo contains a cat or a skyscraper, for example. But a food manufacturer would get more value from a vision system able to spot specific defects in items on its production line.

All the cloud providers either already offer or have promised ways for customers to train algorithms on their own data, for their own problems. But creating customized artificial intelligence software can only be made so easy, says Curran. “You need to have the right people and expertise, and those are in short supply,” he says.

Full news: https://www.technologyreview.com/s/602037/google-and-microsoft-want-every-company-to-scrutinize-you-with-ai/

Friday, 22 July 2016

Apache Hive: Creating a Hive Table and Loading the traffic_violation Data

DataScience99.com 03:06


create table traffic_violation (
  date_of_stop String, time_of_stop timestamp, agency String, subagency String,
  description String, location String, latitude float, longitude float,
  accident String, belts String, personal String, property_damage String,
  fatal String, commercial String, hazmat String, commercial_vehicle String,
  alcohol String, work_zone String, state String, vehicle_type String,
  year int, make String, model String, color String, violation_type String,
  charge bigint, article String, contributed_to_accident String, race String,
  gender String, driver_city String, driver_state String, dl_state String,
  arrest_type String, geolocation String
)
row format delimited
fields terminated by ','
stored as textfile;

Hive reads a timestamp column from a text file only when the value looks like yyyy-MM-dd HH:mm:ss; values in any other form load as NULL.

Load the data into the table:

load data local inpath '/root/manish/Traffic_Violations.csv' overwrite into table traffic_violation;
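Hive's "row format delimited" splits each line on every comma and does not interpret CSV quoting, so a quoted field that itself contains a comma shifts all the columns after it. The following is a minimal Python sketch, not part of the original post, that flags such rows before loading. The function name and the sample rows are made up for illustration, and it assumes no field contains an embedded newline.

```python
import csv
import io


def rows_with_embedded_delimiters(lines, delimiter=","):
    """Yield (line_number, naive_count, csv_count) for every row where a
    plain split on the delimiter disagrees with a quote-aware CSV parse."""
    for line_number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if not text:
            continue  # ignore blank lines
        naive_count = len(text.split(delimiter))
        csv_count = len(next(csv.reader([text], delimiter=delimiter)))
        if naive_count != csv_count:
            yield line_number, naive_count, csv_count


# Made-up sample rows, NOT taken from Traffic_Violations.csv.
sample = io.StringIO(
    'date_of_stop,description,location\n'
    '01/01/2016,SPEEDING,MAIN ST\n'
    '01/02/2016,"FAILURE TO STOP, RED SIGNAL",OAK AVE\n'
)
problems = list(rows_with_embedded_delimiters(sample))
print(problems)
```

To check a real file, pass an open file object instead of the sample, for example open('/root/manish/Traffic_Violations.csv', newline=''). An empty result suggests the simple comma-delimited table definition is adequate for that file.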

Download the file: traffic_violation data

Apache Hadoop: Creating a Hive Table and Loading the Crime Data

DataScience99.com 01:06


create table crimedata (
  id int, case_number String, `date` String, block String,
  iucr int, primary_type String, description String, location_description String,
  arrest String, domestic String, beat int, district int,
  ward int, community_area int, fbi_code int, x_coordinate bigint,
  y_coordinate bigint, year int, update_on timestamp, latitude float,
  longitude float, location float
)
row format delimited
fields terminated by ','
stored as textfile;

The column name date is quoted with backticks because DATE is a reserved word in recent Hive versions.

load data local inpath '/root/data/crimes_-_2001_to_present.csv' overwrite into table crimedata;
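For wide tables like the two above, typing a long column list by hand invites typos. As a rough sketch (not from the original post), the statement can instead be generated from a list of (name, type) pairs. The function name, the table name crimedata_sample, and the three-column subset are hypothetical and chosen only for illustration.

```python
def build_create_table(table, columns, delimiter=","):
    """Return a Hive `create table` statement for a delimited text file.

    `columns` is a list of (column_name, hive_type) pairs.
    """
    column_defs = ",\n  ".join(
        f"{name} {hive_type}" for name, hive_type in columns
    )
    return (
        f"create table {table} (\n  {column_defs}\n)\n"
        "row format delimited\n"
        f"fields terminated by '{delimiter}'\n"
        "stored as textfile;"
    )


# Hypothetical three-column subset, for illustration only.
columns = [("id", "int"), ("case_number", "string"), ("year", "int")]
ddl = build_create_table("crimedata_sample", columns)
print(ddl)
```

The printed statement can be pasted into the Hive shell or saved to a script file. Keeping the column list as data also makes it easy to compare against the header row of the CSV file.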

Download the file: Crime data

Wednesday, 20 July 2016

SoftServe’s relationship with Cloudera will provide customers with real-time big data analytics, high performance in classical structured data analysis, more accurate predictive analytics, and business intelligence and visualisation

DataScience99.com 12:06

SoftServe has joined the Cloudera Connect Partner Program. Cloudera, which is offering a unified platform for big data built around open source Apache Hadoop, is working with SoftServe to help organisations gain a competitive advantage from their data by providing them with data acceleration capabilities for real-time decision-making through professional services.

SoftServe’s new relationship with Cloudera will provide customers with real-time big data analytics, high performance in classical structured data analysis, more accurate predictive analytics, business intelligence and visualisation and network configuration optimisation.

“With their unique strengths in professional services, we are pleased to welcome SoftServe to the Cloudera Connect Partner program,” said Tim Stevens, vice president for corporate and business development at Cloudera. “SoftServe’s professional team of experts turn data into insight and advantage, so now our mutual customers are able to receive end-to-end big data solutions that deliver effective and timely business results.”

“Cloudera is the definitive leader in emerging big data technology for the enterprise, so our two companies working together is a perfect fit for SoftServe’s professional services organisation,” said Neil Fox, EVP and CTO at SoftServe.

SoftServe has longstanding expertise in the various technologies in Cloudera’s big data ecosystem, including Hadoop, HBase and Flume, and an increasing speciality in newer technologies such as Spark. This, combined with depth in analytical tools and languages, including R, Python and Scala, enables SoftServe to “deliver innovative big data solutions”, said Cloudera.

Read more at http://www.channelbiz.co.uk/2015/04/24/cloudera-adds-softserve-pro-services-to-hadoop-platform/#Qsfm8IQ4TyET9HZG.99

Apache Hadoop: Sqoop Scripts for Importing Data from an RDBMS to HDFS and to Hive

DataScience99.com 03:33

Sqoop is a tool for transferring data between Hadoop and relational database servers. It imports data from relational databases such as MySQL and Oracle into HDFS, and exports data from HDFS back to relational databases. This brief tutorial shows how to use Sqoop in the Hadoop ecosystem.

Sqoop: “SQL to Hadoop and Hadoop to SQL”

Sqoop Import

The Sqoop import command imports a table from an RDBMS into HDFS. Each row of the table becomes a separate record in HDFS. Records can be stored as text files, or in binary form as Avro data files or SequenceFiles.

Importing an RDBMS table to HDFS

Syntax:

$ sqoop import --connect <jdbc-url> --table <table> --username <user> --password <password> --target-dir <dir> -m 1

--connect JDBC URL of the database to connect to (jdbc:mysql://localhost:3306/test)

--table Name of the source table to import (sqooptest)

--username Username for connecting to the database (root)

--password Password of the connecting user (12345)

--target-dir HDFS directory that receives the imported data (/output)

-m Number of map tasks to run in parallel (1)

Importing an RDBMS table to Hive

sqoop import --connect jdbc:mysql://localhost:3306/ecafe --table mm01_billing --username root --hive-import --hive-table mm01_billing --target-dir /apps/hive/warehouse/mm01_billing -m 1
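When the same import has to run for several tables, it can help to assemble the command from parameters rather than editing a long line by hand. The following is a minimal Python sketch, not part of the original post. It uses only the options shown above; the function name and its parameters are hypothetical, and it assumes Python 3.8 or later for shlex.join.

```python
import shlex


def sqoop_import_args(connect, table, username, target_dir,
                      password=None, hive_table=None, num_mappers=1):
    """Build the argument list for a `sqoop import` command."""
    args = ["sqoop", "import",
            "--connect", connect,
            "--table", table,
            "--username", username]
    if password is not None:
        args += ["--password", password]
    if hive_table is not None:
        # Same two flags as in the Hive import command above.
        args += ["--hive-import", "--hive-table", hive_table]
    args += ["--target-dir", target_dir, "-m", str(num_mappers)]
    return args


# Values taken from the option descriptions above.
args = sqoop_import_args(
    connect="jdbc:mysql://localhost:3306/test",
    table="sqooptest",
    username="root",
    password="12345",
    target_dir="/output",
)
print(shlex.join(args))
```

The list can be passed to subprocess.run(args, check=True) on a machine where Sqoop is installed. Passing the arguments as a list avoids shell quoting problems in table names or paths.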


HTML Home

DataScience99.com 01:11

<!DOCTYPE html>
<html>
<head>
<title>Page Title</title>
</head>
<body>

<h1>This is a Heading</h1>
<p>This is a paragraph.</p>

</body>
</html>

About HTML

DataScience99.com 01:06

HTML stands for Hyper Text Markup Language, the most widely used language for developing pages on the Web.

HTML was created by Tim Berners-Lee in late 1991, but HTML 2.0, published in 1995, was the first standard HTML specification. HTML 4.01, a major version, was published in late 1999. HTML 4.01 is still widely used, but the current version is HTML5, an extension of HTML 4.01 that reached W3C Candidate Recommendation status in 2012 and became a W3C Recommendation in October 2014.

Prerequisites

Before proceeding with this tutorial, you should have a basic working knowledge of the Windows or Linux operating system. You should also be familiar with:

Using a text editor such as Notepad, Notepad++, or EditPlus.
Creating directories and files on your computer.
Navigating between directories.
Typing content into a file and saving it.
Common image formats such as JPEG and PNG.
