Most assessments of children’s screen media time rely on self-reports or on tools originally intended to limit TV viewing. These methods are subject to biases and misclassification errors. Direct measurement or video-recorded observation with human coding yields better screen use data, but cost and privacy concerns are mostly prohibitive. Technological advances (including in automated person detection, facial recognition, and imaging and signal processing algorithms) offer new ways to measure screen use automatically and objectively. This study advanced the development of two versions of the FLASH automatic, privacy-preserving monitoring system: FLASH-TV (to monitor use of large, stationary screens) and FLASH-Mobile (a background app using the front-facing cameras on mobile devices). Human coders were able to identify gaze 83% to 100% of the time, with a mean accuracy of 94.2%. The FLASH system achieved a face detection rate of 94%. Next steps in the FLASH system’s development are (i) to achieve a sub-5% error rate, (ii) to develop machine learning-based algorithms for target recognition and gaze/no-gaze estimation from the extracted face images, and (iii) to integrate them into a system that can process video data in real time.