Many approaches to human activity classification rely on accelerometer sensors or on cameras installed in the environment; relatively little work uses egocentric video. Accelerometer-only systems, although computationally efficient, are limited in the variety and complexity of the activities they can detect. For instance, accelerometer data can reveal a sitting event, but not whether the user has sat on a chair or a sofa, or what type of environment the user is in. To detect activities with more detail and context, we present a robust and autonomous method that uses both accelerometer and ego-vision data obtained from a smartphone. A multi-class Support Vector Machine (SVM) classifies activities from accelerometer data and optical flow vectors. Objects in the scene are detected from camera data with an Aggregate Channel Features (ACF) based detector, and a second multi-class SVM detects the act of approaching different objects. A Hidden Markov Model (HMM) is then employed to detect more complex activities. Experiments were conducted with subjects sitting on chairs, sitting on sofas, and walking through doorways. The proposed method achieves overall precision and recall rates of 95% and 89%, respectively.
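As a minimal sketch of the first classification stage, a multi-class SVM can be trained on feature vectors that concatenate accelerometer statistics with optical-flow summaries. The feature layout, window counts, and class labels below are hypothetical illustrations using scikit-learn, not the paper's actual configuration or data:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical 10-dimensional feature layout: 6 accelerometer statistics
# (mean and variance per axis) concatenated with 4 optical-flow summaries
# (e.g., mean horizontal/vertical flow magnitude and direction).
# Toy labels: 0 = sit, 1 = stand, 2 = walk.
def synthetic_windows(n, label):
    # Well-separated synthetic clusters stand in for real sensor features.
    centers = {0: -2.0, 1: 0.0, 2: 2.0}
    return rng.normal(loc=centers[label], scale=0.3, size=(n, 10))

X = np.vstack([synthetic_windows(100, c) for c in (0, 1, 2)])
y = np.repeat([0, 1, 2], 100)

# Multi-class SVM: SVC performs one-vs-one classification internally.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X, y)

print(clf.predict(synthetic_windows(1, 2)))  # a walking-like window
```

In a real pipeline the features would be computed over sliding windows of smartphone accelerometer readings and dense optical flow from the egocentric camera, and the SVM's per-window labels would then feed the downstream HMM stage.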